Author

I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.

Tuesday, April 4, 2006

Building a safe community for online translation works

This article focuses on the various issues related to online trust for freelance translation jobs and the various solutions adopted by freelance websites in this domain. As with all online activities, trust is critical yet difficult to obtain. My personal experience in this domain comes from the management of the PeopleWords website.

The naive approach: the rating system

Most freelance websites provide a rating system for all users (PeopleWords is no exception, see [1]). After each translation job, the customer can rate the translator and vice versa. In the long run, competent translators and reliable customers accumulate a large number of positive evaluations, and trust is built on those evaluations. The downside of the rating system is the initialization phase. This phase should not be neglected, because every customer or translator has to go through it first (no matter how "established" the website is). Most of the websites that I am aware of do not do much besides their rating system. For example, since PeopleWords is just starting at this moment, nobody has positive evaluations (well, I do have positive evaluations, see what the translators say about me, but being the website operator myself, it does not count for much). I will start with a short analysis of the risks involved for unrated translators or customers. The analysis is followed by an evaluation of a few approaches that can be used to solve those issues.

Risk analysis on the translator side

For example, on PeopleWords, a translator is paid only on delivery of the translated documents. Therefore there is a risk for the translator of accepting a job and not getting paid in the end. The important question is: what is the risk of accepting a job from an unrated customer? My personal opinion on that matter is: not much. I am not saying that dishonest customers do not exist; I am just saying that I believe them to be quite infrequent, probably not more than a few percent. Why?

• It's hard, as a customer, not to reveal your identity through the documents to be translated. In this respect, the situation is not symmetric between customers and translators. The online translator can easily disguise his identity, but the task is much harder for the online customer. As a crook, your identity is certainly the last thing you would like to reveal to your victims.

• Almost no translation job comes as a one-shot. If you need a translator now, then, most probably, you will need a translator again in the near future. Therefore it is more interesting, even on a purely financial basis, to have a stable customer/translator relationship rather than trying to crook random translators again and again. Also, the web is not a very forgiving place: if you crook somebody, this person is probably not going to sue you (at least not for \$100), but this person can hurt your reputation by spreading the word through forums or blogs. Additionally, if the job is a large and obvious one-shot, then the translator can ask for the payments to be split up as the work goes on.

• As a customer, what you obtain from the translator has no financial value for anybody but yourself or your company. This point should not be overlooked. Indeed, if I refuse to pay an online vendor on eBay after receiving an ordered product, I can actually re-sell this product and obtain some cash from it. But in case of a translation job, I have no simple way to turn a translated document into cash.

As a side argument, I would add that online freelancers are well armed against online fraud. Professional freelancers are usually quite experienced in the art of dealing with "remote" people. Just as experienced web users easily spot phishing attacks, experienced freelancers can probably identify most online scams.

Risk analysis on the customer side

Typical freelance scam: Through many years experience, I translate all documents in many languages. All domains. Lowest prices. No delay. Please contact me at vishnu123@gmail.com.

As a customer, since you pay only on delivery of the documents, there are no risks, right? Wrong, there are a lot of risks. Worse, the customer may not even realize that he has been crooked before the document is actually published in one form or another.

• As a customer, you have no way to check the quality (or even the content) of the translated documents. How can I check that the translator didn't just use software to produce an abysmal translation of my documents? This point is very important, because it means that crooks can actually turn translation jobs into cash through sabotage.

• The costs incurred through missed deadlines (or through poor translation quality) can also vastly exceed the translation costs. In other words, if the customer does not get the translated documents within a given deadline (or if the quality is poor), the delay (or the bad impact) can cost him much more than the actual translator fees.

Translation sabotage is really a major issue. On PeopleWords, I have identified that roughly 20% of the translation offers are just scams. This problem is far from being specific to PeopleWords. On most of the websites that I have tried, the scam rate was, in my experience, always roughly the same (actually, I would say, the more "established" the website, the higher the scam rate). The fact that customers are, on average, much less experienced than freelancers at dealing with scammers only aggravates the problem.

Solution n°1 that does not work: Screening job offers for pre-approval

Most of the freelance websites actually screen all offers for a pre-approval by their staff before publishing the offers. One obvious drawback of such an approach is the additional business day of delay to get a job offer published. The second drawback is that such a screening process requires human resources that are paid, in the end, by the customers.

There are basically two kinds of abusive job offers. The first kind is just plain classical spam (the content of the offer is simply totally irrelevant to any translation job). This kind of abuse is annoying but mostly harmless for the freelance translators. The second kind of abuse is the case of the never-paying customers that continuously create new accounts under different names to crook freelance translators on a regular basis.

Unfortunately, the second kind of abuse is never going to be filtered because, no matter how dedicated the website operators are, the offer "looks" right (there is nothing wrong with the actual offer content).

My opinion is that many websites rely on such a pre-approval step because it gives them a (false) sensation of control. But clearly this practice does not add any value either for the customer (whose job offers are delayed) or for the translators (because real abuse won't be filtered).

Solution n°2 that does not work: Escrow account

Some websites offer an escrow account where the customer can put the money at the beginning of the job. When the job is done, the customer releases the amount frozen in the escrow account. In case of disagreement, a neutral party is called to "solve" the case. The important question is: does the escrow account reduce the overall risks or costs? My conclusions are actually quite counter-intuitive: I believe that the escrow account increases the risks and the costs for both customers and translators.

But if the escrow account provides no advantage for either the customer or the freelancer, why does it even exist? Why would web designers ever have bothered to implement such a system if that was the case?

My opinion (discussed below) is that customers and freelancers have no interest in relying on escrow accounts; but there is one person that actually has a strong financial interest in such a system: the website operator himself. Indeed, the "frozen" money in the escrow account is not frozen for everyone: the money is actually earning interest for the profit of the website operator. Moreover, that money provides a large amount of cash to run the website, to the detriment of both customers and translators. When there is no disagreement, the escrow account is clearly just a financial overhead for both of them.

Let us now see what happens in case of disagreement. The disagreement can only occur if the customer refuses to release the money from the escrow account. Why would the customer do that? Case A: the customer is a crook, he wants his money back. Case B: the translator is a crook, the translation does not match the quality requirements. In either case, the customer, honest or not, will claim that the translated documents do not meet some "quality" requirements. For the neutral party who is supposed to "solve" the case, it's going to be tough; because unless you have 200 translators covering all technical domains in all languages, you have no (reliable) way to tell if the customer is honest or not.

Worse, the neutral party has no interest in solving the case (well, no interest in solving it as fast as possible). The longer it takes to solve the case, the more profitable it is, because the money is earning interest in the neutral party's bank account. In the end, what happens? Since I tend to pay freelance translators when I require their services, I have never experienced the "disagreement resolution" part of the system myself; therefore the following is just a wild guess of my own (remember that we are considering people with no previous ratings). In case of disagreement, the website operator will first wait and then take a random decision (like splitting the amount of money in two and sending 50% back to the customer). Why a random decision? Because, as a website operator, you cannot financially afford a "wise" decision for each disagreement. It is not possible to pay reliable offline translators to evaluate disagreements when your profit margin is lower than 5%. As a consequence, the only option available to the website operator is to take random (or quasi-random) decisions.

Filtering abusive translators: a solution that might work

Abusive translation offers are the Number One issue for customers and translators alike. Why? Because abusive translators (i.e. scammers) hurt the freelance system as a whole. Being myself a customer of freelance translation services, I always have the feeling of walking through a minefield when I browse freelance translator offers. As I said, roughly 20% of the offers definitely look very suspicious. As a direct consequence, I would guess that most customers will simply be very reluctant to risk a translation job over the web because of the "minefield" answers that they get.

The situation is not desperate though. The good news is that liars tend to be greedy and lazy too. When I started to browse the system logs of PeopleWords (which include the times, the languages and the number of offers made by each translator), I realized that scammers were basically all following the same pattern: they answer all open translation job offers in less than 10 minutes. Languages do not matter (scammers can do anything ...), specialized technical documents are never an issue, and no matter the amount of work, it can always be done in 48h. As I said, scammers are too greedy to resist the urge of answering ALL open translation job offers, and scammers are too lazy to create accounts with ad hoc profiles that would actually match the translation jobs.
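The pattern described above can be turned into a simple log-based filter. The following is only a minimal sketch, not the actual PeopleWords implementation: the `Offer` record, the function name and all the thresholds (90% coverage of open jobs, 10 minutes, 3 distinct language pairs) are hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One translator offer, as it could be read from the logs (hypothetical)."""
    translator: str
    job_id: int
    language_pair: str            # e.g. "en->fr"
    minutes_after_posting: float  # delay between job posting and offer


def suspicious_translators(offers, open_job_count,
                           coverage_threshold=0.9,
                           fast_minutes=10.0,
                           min_language_pairs=3):
    """Flag translators who answer (nearly) all open jobs, always very
    quickly, and across many unrelated language pairs."""
    if open_job_count <= 0:
        return set()

    by_translator = defaultdict(list)
    for offer in offers:
        by_translator[offer.translator].append(offer)

    flagged = set()
    for name, items in by_translator.items():
        # Share of the open jobs this translator has answered.
        coverage = len({o.job_id for o in items}) / open_job_count
        # Every single offer arrived within the "too fast" window.
        always_fast = all(o.minutes_after_posting < fast_minutes for o in items)
        # "Languages do not matter": many distinct language pairs.
        pairs = {o.language_pair for o in items}
        if (coverage >= coverage_threshold and always_fast
                and len(pairs) >= min_language_pairs):
            flagged.add(name)
    return flagged
```

All three signals are required together in this sketch, so that a fast but specialized translator, or a translator who is simply very active in a single language pair, is not flagged.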

What should be done with those scammers? Banning them is a bad idea. Indeed, once banned, the scammer will just come back under a different name, and we are back to the starting point. Therefore, in PeopleWords, I have opted for a more twisted solution: scammers are just shadowed. Since the scammer is not banned, he can continue to browse all translation jobs, and he can continue to post offers too. The only difference is that offers sent through a shadowed account are never visible from the customer side... This system is not perfect: a scammer can create a customer account, log in, post a job, log out, log in again, post an offer, log out, log into the customer account again, and check if his offer is visible. But it's a time-consuming process, and scammers are already too lazy to provide a user profile that would simply match the translation jobs anyway.
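The shadowing idea boils down to a visibility filter applied when offers are listed. Here is a minimal sketch; the dictionary layout, the function name and the rule that a shadowed account still sees its own offers are assumptions made for illustration, not a description of the PeopleWords code.

```python
def visible_offers(offers, shadowed_accounts, viewer):
    """Return the offers that `viewer` is allowed to see.

    Offers from shadowed accounts are hidden from everybody except their
    own author (assumption), so that the scammer notices nothing unusual.
    """
    return [
        offer for offer in offers
        if offer["translator"] not in shadowed_accounts
        or offer["translator"] == viewer
    ]


# Hypothetical usage: "mallory" has been shadowed by the operator.
offers = [
    {"translator": "alice", "job_id": 1},
    {"translator": "mallory", "job_id": 1},
]
shadowed = {"mallory"}
customer_view = visible_offers(offers, shadowed, viewer="some_customer")
mallory_view = visible_offers(offers, shadowed, viewer="mallory")
```

In this sketch the customer only receives the offer from "alice", while "mallory" still sees both offers; the account is never told that it has been shadowed.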

I believe that this approach has a large added value for both the customers and the translators. Indeed, those scams are obvious only from the website operator's viewpoint (the operator can easily spot translators making inconsistent offers), but would probably be less than obvious for customers, who do not have access to the website logs.

Filtering abusive customers: a solution that might work

As discussed above, the most "dangerous" kind of abusive customers are not the spammers (those are just annoying), but the never-paying customers seeking "free" translations, crooking a different translator each time. Since no particular patterns (at least no obvious patterns) can be used to distinguish such a customer, it is not possible to "catch" them before they actually crook at least one translator. Moreover, since the customer ends up with a negative evaluation after crooking a translator, he never uses the same account twice. Ratings are just useless for dealing with those people.

The approach used by PeopleWords consists in relying on the only people that are actually able to detect the fraud: the translators themselves. In order to leverage this "distributed" knowledge, PeopleWords includes a system dedicated to abuse reports. Freelancers can thus help the website to prevent such a customer from crooking other freelancers.

What should be done with those crooks? Again, banning them is a bad idea because they will come back. I am still hesitating between several options to deal with such users. The option currently in use consists in shadowing their job offers. A shadowed job offer is no longer visible to the translators. The only issue is that such customers will quickly realize that something is wrong if they do not get any answers, because when you propose a freelance translation job, you always get offers. Therefore, as an additional option, I am thinking of using the scammers list, but for the reverse purpose. Instead of displaying the legitimate translator offers, only the scam offers are displayed to such customers. This option is so twisted that I am not sure I will ever actually use it, but using scammers to fight back against never-paying customers is still a very interesting perspective.

Notes

[1] PeopleWords only provides binary evaluations: I am satisfied or I am not satisfied (plus an additional but optional comment). This system is very rough compared to most of the other websites. For example, Guru.com provides no less than 10,000 possible rating combinations after each job (4 evaluation criteria, each ranging from 1 to 10, do indeed give 10^4 = 10,000 combinations). If PeopleWords has such a simple system, it is not due to a lack of funds (although PeopleWords is dramatically under-funded) but to a design choice. If tomorrow PeopleWords provides 20 evaluation criteria, each one of them ranging from 1 to 100, will the PeopleWords ratings become 10 times more accurate than Guru's ratings? Certainly not. As a customer, I have no clear idea of the "quality" of the translation job anyway. I can use the MS Word spell checker to see if the translator has left many obvious spelling mistakes, but I can't really do more. As a translator, there is usually not much to say about the customer either. My personal opinion is that providing more complicated rating systems actually decreases the quality of the ratings. Indeed, if the system is not totally idiot-proof, then users will start to make mistakes simply because they do not understand the system. For example, if you can evaluate somebody on a scale ranging from 1 to 10, what is the meaning of 10? In the German educational system, 1 is the best grade. In the French (primary) educational system, 10 is the best grade. Some other countries, like the USA, prefer the use of letters. Bottom line: do not try to be smart, you will just confuse your users.

Friday, February 10, 2006

When numerical precision can hurt you

The objective was to cure a very deadly disease and the drug was tested on mice. The results were impressive since 33% of the mice survived while only 33% died (the last mouse escaped and its outcome was unknown).

Numerical precision depends on the underlying number type. In .NET, there are three choices: float (32 bits), double (64 bits) and decimal (128 bits). Performance left aside, more precision cannot hurt, right?

My answer is: it depends. If the only purpose of your number is to be processed by a machine, then fine, more precision never hurts. But what if a user is supposed to read that number? I actually encountered this issue while working on a project of mine, Re-Dox, reduced design of experiments (an online analytical software). In terms of usability, providing the maximal numerical precision to the user is definitely a very poor idea. Does adding twelve digits to the result of 10/3 = 3.333333333333 make it more readable? Definitely not.

A very interesting issue when designing analytical software (i.e. software performing some kind of data analysis) is choosing the right number of digits. Smart rounding can be defined as an approach that seeks to provide all significant, but only significant, digits to the user. However, the notion of "significant" digits is very dependent on the context and carries a lot of uncertainty. Therefore, for the software designer, smart rounding is more likely to be a tradeoff between usability and user requirements.

Providing general rules for smart rounding is hard. But here are the two heuristics that I am using. Both of them rely on user inputs to define the level of precision required. Key insight: since it's usually not possible to know the accuracy requirements beforehand, the only reliable source of information is the actual user inputs.

Heuristic 1 = the number of digits in your outputs must not exceed the number of digits of the user input by more than 1 or 2. Ex: if the user inputs 0.123, then provide a 4 or 5 digit rounding. Caution: do not take the user inputs "as such", because they can include a lot of dummy digits (ex: the user can cut and paste values that look like 10.0000, where the trailing digits are zeros and implicitly not significant). The underlying idea is "no algorithm ever creates any information, an algorithm only transforms the information".
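Heuristic 1 can be sketched in a few lines of Python. This is only an illustration under several assumptions of mine: the inputs are available as the raw text typed by the user (plain decimal notation, no exponent), the most precise input sets the reference, and the function names `significant_digits` and `smart_round` are hypothetical.

```python
def significant_digits(text: str) -> int:
    """Count the significant digits of a user-typed decimal number.

    Trailing zeros of the fractional part are treated as dummy digits
    ("10.0000" counts as "10"), and leading zeros never count.
    """
    s = text.strip().lstrip("+-")
    if "." in s:
        s = s.rstrip("0")  # drop dummy trailing zeros after the decimal point
    digits = s.replace(".", "").lstrip("0")
    return max(len(digits), 1)


def smart_round(value: float, user_inputs, extra_digits: int = 1) -> str:
    """Format `value` with at most `extra_digits` more significant digits
    than the most precise user input (assumption: the maximum is used)."""
    reference = max(significant_digits(text) for text in user_inputs)
    # The "g" format rounds to a given number of significant digits.
    return format(value, f".{reference + extra_digits}g")


print(smart_round(10 / 3, ["0.123"]))                  # 3.333
print(smart_round(10 / 3, ["0.123"], extra_digits=2))  # 3.3333
```

Note that the "g" format switches to scientific notation for very large or very small values, which may or may not be acceptable depending on the users.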

Heuristic 2 = increase the number of digits of heuristic 1 by a number equal to CeilingOf(log10(N)/2) where N is the number of data inputs. Actually, this formula is simply an interpretation of the Central Limit Theorem (Wikipedia) for the purpose of smart rounding. Why the need for such a bizarre heuristic? The underlying idea is slightly more complicated here. Basically, no matter how you combine the data inputs, the rate of accuracy improvement is bounded. The bound provided here corresponds (somehow) to an "optimistic" approach where the accuracy increases at the maximal possible speed.
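One way to read this formula: when N independent inputs are averaged, the standard error shrinks by a factor sqrt(N), which is worth log10(sqrt(N)) = log10(N)/2 decimal digits. The sketch below computes the extra digits; the function name is hypothetical, and integer arithmetic is used instead of a floating-point logarithm so that exact powers of ten are not affected by rounding errors.

```python
def extra_digits_for_inputs(n: int) -> int:
    """Return CeilingOf(log10(n) / 2) for n >= 1.

    This is the smallest integer k such that 100**k >= n, computed with
    integers only.
    """
    if n < 1:
        raise ValueError("the number of data inputs must be at least 1")
    k = 0
    while 100 ** k < n:
        k += 1
    return k


for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, extra_digits_for_inputs(n))
# 1 0
# 10 1
# 100 1
# 10000 2
# 1000000 3
```

In other words, a single extra digit already covers up to 100 data inputs, and it takes a million inputs to justify three extra digits.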
