OIL (the Open Internet Lexicon) is a web-based database of words and short phrases that are commonly used in navigating web pages: words like Enter, Register, Search, Go, Next, Previous, Home, Browser, and Webmaster, and expressions like Click here, Contact Us, Download the latest version, This site is under construction, and Frequently Asked Questions. We hope to collect the idiomatic terminology of the Internet, which is rapidly evolving; for example, the formal Portuguese correio eletrônico is becoming correio-e on the analogy of English e-mail.
We want to build a resource which web site developers, with minimal language skills, can use to demonstrate a multilingual web site to their clients. OIL is designed to be built by translators and localizers working from dozens of locales, with editing done directly on the web. The contents of the OIL database will be exportable and downloadable free in several formats - a text file, a spreadsheet format, or the entire database file. We are currently studying Trados and SDLX translation memory systems to see whether we might export data as a TMX file.
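As a rough illustration of the export formats mentioned above, here is a minimal Python sketch that writes the same term list as spreadsheet-friendly CSV and as a bare-bones TMX document. The sample terms, the function names, and the header values (such as the creation tool name) are assumptions made for illustration; they are not taken from the OIL database or its actual export code.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample rows: one dict per term, keyed by locale code.
TERMS = [
    {"en": "Search", "fr": "Rechercher", "pt": "Pesquisar"},
    {"en": "Contact Us", "fr": "Contactez-nous", "pt": "Fale conosco"},
]


def export_csv(terms, locales):
    """Return the terms as CSV text, one column per locale."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=locales, lineterminator="\n")
    writer.writeheader()
    for term in terms:
        # Missing translations are written as empty cells.
        writer.writerow({loc: term.get(loc, "") for loc in locales})
    return buffer.getvalue()


def export_tmx(terms, source_locale="en"):
    """Return the terms as a minimal TMX 1.4 document (a string)."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "oil-export-sketch",   # assumed name
        "creationtoolversion": "0.1",
        "segtype": "phrase",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": source_locale,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for term in terms:
        unit = ET.SubElement(body, "tu")       # one translation unit per term
        for locale, text in term.items():
            variant = ET.SubElement(unit, "tuv", {"xml:lang": locale})
            ET.SubElement(variant, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


print(export_csv(TERMS, ["en", "fr", "pt"]))
print(export_tmx(TERMS))
```

Each term becomes one translation unit with one variant per locale, so a translation memory tool that reads TMX should be able to treat the lexicon as aligned segments.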
Firstly, web site developers benefit, since they can open their clients' eyes to a world market. Then the clients benefit, if a world presence helps with their web publishing objectives. But ultimately the community of translators and localizers benefits most, because it will be getting much more work: translation of a site's core content must still be done by language professionals.
In the very short term we need localizers from the standard FIGSP languages (French, Italian, German, Spanish, and Portuguese) to help us translate the OIL web site pages, after which we will publicize the site more broadly.
So far we have entered terms in the Lexicon database in the standard FIGSP languages, using a mix of machine translation and web surfing to find popular terminology. What we need now are native speakers who live in a locale, or who surf the web in that locale, and can attest to current practice there.
There are hundreds of excellent glossaries, and many superb bilingual and multilingual dictionaries. The OIL site will be a portal to the best of these. Only a few have as their special subject area the Internet and Web. The trilingual English-Spanish-French Internet glossary by the Canadian Bureau of Translation is an excellent example. But our goal is the most basic terms, and quite a few phrases, in dozens of locales.
A big difference from most glossaries is that the OIL terms are in a database that is web accessible and editable over the web by a community of specialists who want to help create such a resource. The current architecture allows a new locale to be generated from the term lists already present in the database. Thus, if we find a localizer who wants to specialize the terms for Canadian French, we can add a locale for fr-CA and ask the software to initialize it with the hundreds of terms from standard French. This open web-based development makes it possible for native speakers familiar with the evolving web in their cultures to keep the dictionary fresh and relevant in Internet time.
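The locale initialization described above might look something like the following Python sketch. The in-memory dictionary, the function name add_locale, and the sample terms are all hypothetical; the real OIL database schema is not described here, so this only shows the copy-then-specialize idea.

```python
def add_locale(lexicon, new_locale, parent_locale):
    """Create new_locale by copying every term from parent_locale.

    `lexicon` maps locale code -> {term key -> translation}. The returned
    set lists the copied keys that a localizer has not yet reviewed.
    """
    if new_locale in lexicon:
        raise ValueError(f"locale {new_locale!r} already exists")
    if parent_locale not in lexicon:
        raise KeyError(f"unknown parent locale {parent_locale!r}")
    # Copy, so later edits to fr-CA never change the standard French list.
    lexicon[new_locale] = dict(lexicon[parent_locale])
    return set(lexicon[new_locale])


# Hypothetical starting point: two terms in standard French.
lexicon = {"fr": {"email": "courrier électronique", "home": "Accueil"}}

pending = add_locale(lexicon, "fr-CA", "fr")
lexicon["fr-CA"]["email"] = "courriel"   # the localizer's regional override
pending.discard("email")                 # mark that term as reviewed

print(lexicon["fr-CA"])
print("still to review:", pending)
```

Keeping a list of unreviewed terms is one possible design choice: the new locale is usable immediately, while the localizer can see which inherited entries still need a native speaker's eye.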
Our database-backed system is designed to handle hundreds of locales, where most other multilingual glossaries do not easily scale beyond a few languages. The OIL web site is not just the Open Internet Lexicon. It's not just tools and resources for making a web site multilingual. It's also a working demonstration of what a multilingual web site can be. The other projects talk about multilingual. We are multilingual. Finally, the Open Internet Lexicon is really open and free for all to use.
A major secondary goal of the OIL web site is to demonstrate to web developers how web sites should respond to browser requests in different languages. Very few web sites, even those of the best-known multilingual service vendors, respond to a browser request for a specific language with a page in that language. All the big sites offer multilingual versions, but only as hyperlinks from their home pages. The OIL site will serve a Portuguese page if your browser asks for Portuguese. As we build up the number of supported locales, we hope to make OIL the most multilingual site on the web. At the moment, with fourteen localized versions, Berlitz.com appears to be the leader in number of languages (while responding to browser requests only in English). Currently, Google appears to respond to browser requests in the largest number of languages.
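Browsers state their language preferences in the HTTP Accept-Language request header, for example "pt-BR,pt;q=0.9,en;q=0.8". The following Python sketch shows one simple way a server could pick a page language from that header. It is an illustration only, not the skyServer implementation; the function names are invented here, and the "*" wildcard is deliberately ignored to keep the example short.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (tag, quality) pairs, best first."""
    choices = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0                      # default when no q-value is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0              # treat a malformed q-value as refused
        choices.append((tag.strip().lower(), quality))
    # sorted() is stable, so equal-quality tags keep the browser's order.
    return sorted(choices, key=lambda choice: choice[1], reverse=True)


def choose_locale(header, supported, default="en"):
    """Pick the best supported locale for the request, else the default."""
    available = {locale.lower(): locale for locale in supported}
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue                       # q=0 means "not acceptable"
        if tag in available:
            return available[tag]
        primary = tag.split("-")[0]        # fall back from pt-br to pt
        if primary in available:
            return available[primary]
    return default


print(choose_locale("pt-BR,pt;q=0.9,en;q=0.8", ["en", "fr", "pt"]))
```

A browser asking for Brazilian Portuguese is served the generic Portuguese page when no pt-BR variant exists, and a browser asking only for unsupported languages gets the default.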
A few other small features will be added soon. One is a database of language and country names in all languages. This will allow the web developer to populate drop-down menus in the native languages. Another is a database of the names and abbreviations for all the time zones, again in all their localized versions. This will allow a site to refer to times globally with the correct name for the date and time, not just the local presentation formats. We also have a database of the daylight saving time rules around the world. These are all tools that we hope will be useful to web site globalizers.
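To show how a developer might use such a table of language names, here is a small Python sketch that builds a drop-down menu in which each language appears under its own native name. The five-entry table and the function name language_menu are assumptions for illustration; the planned OIL database would supply the real data.

```python
from html import escape

# Hypothetical sample of the planned table: locale code -> name in that language.
NATIVE_NAMES = {
    "de": "Deutsch",
    "en": "English",
    "es": "español",
    "fr": "français",
    "pt": "português",
}


def language_menu(codes, selected=None):
    """Build an HTML <select> whose options show each language's own name."""
    options = []
    for code in codes:
        label = NATIVE_NAMES.get(code, code)   # fall back to the bare code
        flag = " selected" if code == selected else ""
        options.append(
            f'  <option value="{escape(code)}"{flag}>{escape(label)}</option>'
        )
    return '<select name="lang">\n' + "\n".join(options) + "\n</select>"


print(language_menu(["en", "fr", "pt"], selected="pt"))
```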
In the long term, we have a number of big problems to solve. We need to support a much wider range of languages, including Asian languages. This will mean either double-byte character sets or Unicode. We are inclined to do everything in Unicode if possible.
OIL is one of three demonstration web sites sponsored by skyBuilders.com. The others are www.openserverpages.com and www.opendatabasemodel.com. These sites share a goal of open-source-code tools for what we call "community computers." In the "post-PC" world, we see much of our computing activity being done on community computers at our companies and the associations and organizations to which we belong. A file in a notebook or desktop computer is not easily shared.
skyBuilders timeLines (the unusual lower-case words with medial capitals are a convention from programming variable names) is a suite of tools that lets an organization put files in a shared space we call "theSky," which we think of as a "virtual platform." Users have secure web browser access to the community computer web server from anywhere. The current applications in the suite, now at beta version 0.98, are web publishing (content management), task management, event scheduling, web-based presentations, interest group communications, resource management (reservations), and online database-backed forms for surveys, questionnaires, e-learning, etc. Our beta test sites are currently schools, community television centers, sporting clubs, film producers, and theatrical groups.
An important application for skyBuilders timeLines will be multilingual web site development. Our timeLines skyServer can serve time-variants of a generic page (for example a sequence of news pages scheduled for different days), and also language/locale variants. The "timeLined pages" can be worked on now but scheduled for future publication, on a date by which all the language variants will have been localized and proofed for quality. With scheduling information visible on the web to human translators anywhere, simultaneous release of multiple language versions of web pages should become a lot easier to manage.
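The idea of choosing among time-variants and language variants of one generic page can be sketched in a few lines of Python. This is only an illustration of the selection logic under assumed data; the tuple layout, the sample pages, and the function name select_variant are hypothetical and do not describe how skyServer actually stores or serves pages.

```python
from datetime import date

# Hypothetical variants of one generic page: (locale, publish date, body).
VARIANTS = [
    ("en", date(2001, 3, 1), "News for March"),
    ("en", date(2001, 4, 1), "News for April"),
    ("pt", date(2001, 3, 1), "Notícias de março"),
]


def select_variant(variants, locale, today, default_locale="en"):
    """Return the newest variant already published in the requested locale.

    Falls back to default_locale when the requested locale has nothing
    published yet, and returns None when neither has a live page.
    """
    for wanted in (locale, default_locale):
        live = [v for v in variants if v[0] == wanted and v[1] <= today]
        if live:
            return max(live, key=lambda v: v[1])   # latest publish date wins
    return None


page = select_variant(VARIANTS, "pt", date(2001, 3, 15))
print(page[2])
```

Because a variant with a future publish date is simply skipped, translators can prepare and proof pages in advance, and every language version goes live when its scheduled date arrives.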
All skyPages are saved in a persistent archive with a unique naming convention that follows W3C suggestions, so that an organization's web site never need have a broken link. All the skyPages are easily accessible, and form a permanent record of the organization's web activity. As more and more work flows over the web, we are capable of preserving everything ever published. We like to say that "we build institutional memory."
Our skyWriter interface will allow a localizer anywhere in the world to edit a page in place on the web site, without file transfers, email attachments, format conversions, etc. skyPages are version controlled so that the site has infinite backup/undo. The skyEvents and skyTasks management components let a globalization manager oversee the process and share elements of workflow and page scheduling with everyone in the organization, subject to seven permission levels ranging from read-only to administrator.
Our plan is to give our localizers their own web pages on the OIL site. That way they will see how easy editing and localizing over the web can be. They may also help us shake a few more bugs out of the skyBuilders timeLines beta software.
We hope to make Open Internet Lexicon a supersite of hyperlink references to multilingual resources. Many other fine sites do this and very well. We hope to do as well, and we link to every one of them we know about. Our skyBuilders architecture allows visitors to add comments to every page. We will moderate these comments and use them to improve the page. In time, we hope Open Internet Lexicon will be the first place localizers and web developers turn for information on how to build multilingual web sites.
There are many multilingual sites, but most ask you to choose your language from their home page. Few respond to your browser settings for a preferred language. As we mentioned, a good example of one that does respond is Google, the search engine, which answers requests in a dozen languages. Another that does is Alis Technologies, though only in French and English.
We do not know whether their multilingual response is built into their architecture, as it is in the skyBuilders skyServer. They may be using an Apache 2.0 web server, whose default installation supports several languages.
Microsoft Internet Information Server needs server-side programming (like skyBuilders) or an additional component to serve multilingual pages automatically. planetarySales.com's $499 Locale Recognizer is such a component. It enables browser language requests to be negotiated with the server. Their site currently responds in three languages: English, German, and Japanese.
Do you know of some exemplary multilingual sites that we should list? Send us their URLs.
The Lernout & Hauspie and Berlitz GlobalNET web pages are available with brief amounts of aligned text in many languages. Logos has about 10 languages. None of these respond to the browser language request.
Bill Dunlap's Global Reach site is localized in nine languages.
Arlene H. Rinaldi's Netiquette User Guidelines is available in ten languages.
Kevin Werbach's Bare Bones Guide to HTML has been translated into more than twenty languages. Multilingual names for all the HTML elements are here.
Other HTML manuals in various languages (by different authors) contain lots of web terminology.
Bill Dunlap's Global Internet Statistics page reports regularly on the number of web users worldwide browsing in different languages. Also available are details by country and the growth of language populations on the web since 1996 with future projections.
Send us your questions, or add a comment (below) to this page.