Improve your NLP, because language impacts your Business!

I’ve sat in enough business meetings at my work (I can’t mention the name, but it’s the #1 most valuable company in the world) to know one thing: the real king of the Internet is the one who can create an exceptional experience for people in their native language.

Natural Language Processing is primarily Limited to English

Today, computer scientists have developed amazing solutions and algorithms to serve the world. However, the ‘world’ being served is primarily English speakers, which is a problem. To be more specific, search engines such as Google, Bing, and Yahoo are great for us English speakers, but they are poorly designed for speakers of other languages. These search engines have some of the best algorithms for English speakers, utilizing the most advanced Natural Language Processing (NLP) solutions. Universities and their computer science departments are investing heavily in better methods of interaction between humans and computers through language; however, when it comes to languages other than English, it’s an epic fail.

Steps in conducting NLP Analysis: Lexical Analysis, Syntactic Analysis, Semantic Analysis, Discourse Integration, Pragmatic Analysis

Steps currently used to conduct NLP analysis
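
To make the first of those steps concrete, here is a minimal Python sketch of lexical analysis done the English-centric way: splitting text on whitespace. It is only an illustration, not how any particular search engine works; the function name `whitespace_tokens` and both sample sentences are made up for this example. English comes apart into words, while the Chinese sentence, which has no spaces between words, comes back as a single token.

```python
def whitespace_tokens(text):
    """Naive lexical analysis: treat every whitespace-separated chunk as a word."""
    return text.split()


# Two sample sentences asking roughly the same question (illustrative only).
english = "Where can I buy a train ticket"
chinese = "我在哪里可以买火车票"

print(len(whitespace_tokens(english)))  # 7 tokens, one per word
print(len(whitespace_tokens(chinese)))  # 1 token: the whole sentence
```

Chinese text needs a dedicated word-segmentation step before any of the later stages can run, which is one reason a pipeline tuned for English does not carry over to other languages for free.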

It’s no surprise that Baidu exists: its NLP algorithms cater to the Chinese market far better than Google’s, and it’s only a matter of time before someone implements a Hindi NLP algorithm to serve the second largest population in the world.

Search Engine Distribution for China between 2016 and 2017. Baidu is losing ground to new search engines in China, including sm.cn and so.com. Google grew its market share as well, but it's dwarfed by its Chinese competitors.

Looking at Referrer Type

The following is a comparison of two sites by referrer source: one site is served in English, the other in a non-English language. The analysis covers 3 months (one quarter).

Let’s briefly go through the definitions:

  • Referrer means the source from which the visitor arrived at the site.
  • Search represents Search Engines, such as Google, Bing, Yahoo, etc.
  • Email represents email campaigns to the site for the given time period.
  • Social represents social media traffic to the site. For example, Facebook, Twitter, etc.
  • Other Sites represents websites that do not fall in any of the above categories. This includes newspapers, blogs, personal sites, etc.
  • Typed/Direct represents traffic that entered the site via a bookmark, or by the user typing in the URL.
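
The definitions above can be sketched as a small classifier. This is a rough Python illustration, not the analytics tool behind the charts: the domain lists are hypothetical and far from complete, and treating a `utm_medium` campaign tag of "email" as the signal for email traffic is an assumption.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, incomplete domain lists, for illustration only.
SEARCH_DOMAINS = {"google.com", "bing.com", "yahoo.com", "baidu.com"}
SOCIAL_DOMAINS = {"facebook.com", "twitter.com"}


def _matches(host, domains):
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_referrer(referrer, utm_medium=None):
    """Map one visit to a referrer type from the list above."""
    if utm_medium == "email":  # assumption: email campaigns are tagged
        return "Email"
    if not referrer:  # no referrer: bookmark or typed URL
        return "Typed/Direct"
    host = (urlparse(referrer).hostname or "").lower()
    if _matches(host, SEARCH_DOMAINS):
        return "Search"
    if _matches(host, SOCIAL_DOMAINS):
        return "Social"
    return "Other Sites"


def referrer_shares(visits):
    """Percentage of visits per referrer type, rounded to one decimal."""
    counts = Counter(classify_referrer(ref, medium) for ref, medium in visits)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}
```

Feeding a quarter's worth of `(referrer, utm_medium)` pairs from each site into `referrer_shares` would give the kind of breakdown the pie charts show.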

As represented in the graph, the primary driver of traffic to the non-English site is Other Sites, like blogs and newspapers, whereas Search accounts for only 27% of visits from users who speak the non-English language. For a business, where millions are invested in Search Engine Optimization (SEO), this represents a huge problem.

Two pie charts representing visits to the English site and to the non-English site. The English site generated 41% of its traffic from search, whereas the non-English site generated 27% from search. The majority of users of the non-English site arrive via other sites, such as blogs and news sites.

A solution some businesses are implementing to get around this problem is building an app that provides all the necessary experiences to its users. One company using this business model is WeChat, a phenomenal company that’s growing rapidly. WeChat enables users to do nearly everything through its app, whether it’s purchasing items, calling a cab, calling friends, or chatting with friends. It’s what Facebook is trying to implement with Messenger, but the market that WeChat is targeting makes it far more impactful.

This solution implemented by WeChat solves a lot of problems: instead of ‘searching’ for a product, you can find it in WeChat’s popular product listings. Just 6 days before this post was published, WeChat introduced its own search engine, which may completely change the game, as users who primarily use WeChat will only use the search engine available in the app itself.

Combining the Power of Local and International Language

As a member of the Baha’i Faith, I advocate for a single universal language (for now, it’s English, based on how economies run), but to also serve people in their mother tongue? That’s gold! In the Baha’i Faith, these actions are highly encouraged, and they are valuable not only for improving technology but also for narrowing the gap between people by breaking down the barriers of language.

A scene where a girl is saying "Why don't we have both!". The next scene shows a group of people rejoicing and picking her up while the girl is super happy!

Be part of the change! Study NLP

If you’re a linguist looking to create a massive impact in this world and help advance humanity and technology, study NLP and Computer Science. Even educating people not to discount other languages, and to recognize their impact on the bottom line, will help them understand the power of the word.

Why is Haskell’s Website Awesome?

I love computer science and programming languages, and I often visit the websites of different languages, whether it’s Python, Ruby, JavaScript, PHP, etc. Today I visited Haskell’s website, and it blew my mind how well they designed the landing page.
 
When you land on the homepages of some popular programming languages (like C, C++, Perl, or PHP), you get bombarded with useless things like the latest edits or updates to the syntax or code, some obscure or complex syntax that you might use once in a blue moon, conferences, and news that’s highly irrelevant for a beginner coder.
 
Most of these websites do not address the question, “Hey new person, how can we help you get started in this language immediately?”
My assumption, based on the behavior on StackOverflow, is that the largest share of traffic to those programming sites is for documentation purposes. And most of the time, those who want documentation are not familiar with the language, like me, and either visit the parent site or roam around StackOverflow to find an answer to a coding problem.
After solving some of those coding problems, I sometimes wish there were good basic tutorials to help programmers get started immediately, without the hassle of going through StackOverflow or copy-pasting code.
 
While browsing Reddit, I stumbled upon this graph at r/dataisbeautiful that showed the behavior of users on StackOverflow throughout the day. As you can see, during the evening, many Haskell programmers lurk on StackOverflow, which piqued my curiosity about Haskell. So I decided to investigate what Haskell is, and learn what’s going on.

 

I found Haskell’s website and really loved the landing page, which is a straight-up introduction and tutorial on how to use Haskell! It helps beginners pick up coding in Haskell quickly, and more advanced developers can see all the components and details as they go through the tutorial.

Another cool feature of Haskell’s landing page is that the URL updates as you go through the tutorial, making it easy to pick up where you left off.

Here’s Haskell’s website if you’d like to try out their tutorial: https://www.haskell.org/
 
If education in the field of Computer Science should look like anything, this is it 🙂 Engage the user immediately through an interactive learning environment.