The finite state Machines

Theory of computation (TOC) is one of the most important concept in computer science. It lays a strong foundation for abstract areas in computer science and teaches you about the elementary ways in which a computer can be made to think.

The finite-state machine is a branch of TOC. It is a mathematical model of computation or an abstract machine that can be in exactly one of a finite number of states at any given time.



They change states according to the given input. And at one time they are in only one state. One interesting fact is that they don't have any memory like RAM, ROM and do not store any previous information other than the state they are in. Still they do the processing.

The use of FSMs can be observed in many devices in the modern world that perform a predetermined sequence of tasks depending on the sequence of input provided. Examples are vending machines, which dispense products when the proper combination of coins is deposited, elevators, whose sequence of stops is determined by the floors requested by riders, traffic lights, which change sequence when cars are waiting, and combination locks, which require the input of combination numbers in the proper order.

Now let’s look at how earlier mechanical device i.e. A Parking lot Machine used FSMs for processing.

The parking lot machine


The machine is based on the principle of FSM. It accepts only three coins 5, 10 and 20. They are the inputs to the FSM. The machine has six states. When the user puts coins worth 25 units the machine prints a receipt.

The Finite State Automata is shown below.



As we can see the states of the machines are {0, 5, 10, 15, 20, and 25}. Initially it is at zero, when it reaches to 25 it outputs a receipt to the user.
The combinations of the coins that it accepts are:
  • 5, 5, 5, 5, 5
  • 20, 5
  • 10, 10, 5
  • 5, 5, 10, 10, 5 etc.

That’s how they were use in the earlier Mechanical devices. But why do we still study them now as electronic microchips do the same job? Particularly, why do we study them as a part of computer science?

FSM in Computer Science


All machines/systems can be generalized as a FSM. Every system in computer science works in a same way. It receives some input, does the processing and gives output.



After each processing the system goes into another state, same state with loop or maybe final state.
Now let’s look at its very common application in Computer Science, a comment filtering system. Every programming language has comments. They are very useful at the time of coding but they aren’t compiled to the final binary code. So the system has to perform some filtering during compilation. FSM of such a system is given below. It filters the comments in a C/C++ code.



The format of comments in C and C++ are:

C: /* some comment here… */
C++: // some comment here

The above given FSM is embedded at a particular location in the compiler to escape the comments. According to the inputs (‘/’, ‘*’, ‘end-of-line) it moves to another state. When characters of the comments are found the state goes on a loop until it ends. Finally the control of the system returns to the ‘none’ state.

Another common example is the validation of variables/identifiers in programming language. An identifier consists of a combination of symbols, numbers and alphabets.

Some of the valid identifiers are:
  • hello;
  • hello123;
  • hello_123;
And some of the invalid ones:
  • 123hello; //identifiers cannot start with numbers
  • 123@34; // @ symbol is invalid

A FSM can be constructed with certain restrictions to filter invalid identifiers. Not only identifiers, it can be used for any other string matching operations too.
As we have seen, they are useful it a lot of ways and serve as a basic model of computation in computer science. They are also used in games for AI, implementations of user input handling, navigating menu screens, and parsing text and network protocols.


References:
https://www.ics.uci.edu/~eppstein/161/960222.html
https://en.wikipedia.org/wiki/Finite-state_machine
https://www.youtube.com/watch?v=vhiiia1_hC4

Automatic summarization using Text Rank

Automatic summarization is the technique of shortening a text document with computer program to create a summary with the major ideas of the original content.

There are two types of automatic summarization: extraction and abstraction. Extractive methods work by selecting some existing words, phrases, or sentences from the original text to form the summary. On the other hand abstractive use natural language processing techniques to create a summary that is similar to what a human being can generate.

Here we are discussing about the simplest form extracting method i.e. Text Rank Algorithm in JavaScript. It selects the key sentence from a paragraph to generate the summary.

The basic steps are:
  • Split content into paragraphs.
  • From each paragraph choose the most suitable sentence i.e. sentence having highest rank.
  • Join all top ranked sentences to form the summary.
Let’s look at each step in detail.


Split content into paragraph

As all words in a sentences are separated by whitespace and new lines by ‘\n’, the paragraphs are separated by double new lines ‘\n\n’.

function split_into_sentences(s){
  var s = s.split('.');
  return(s);
}

From each paragraph choose the most suitable sentence

So what is the most suitable sentence? Well, it the one with the highest rank. The sentence with highest rank is the one which has more common words in that paragraphs.
E.g.
His name is Bibhuti. And he is a person who writes code. Also, he write a blog.  I code in Java Script too.
Here, the second sentence has the highest rank because it has more similar words i.e. “a”, “is”, “code”, “writes”.

To determine the rank, a dictionary (hash table) for each paragraph is needed. Its key will be the sentence itself and the value will be the count the intersection between the sentence and other sentences in the paragraphs.

The intersection is counted as:

function sentences_insersection(sent1, sent2){
  var s1 = set(sent1.split(" "));
  var s2 = set(sent2.split(" "));
  if(s1.length + s2.length === 0) return 0;
  return(intersection(s1, s2)/((s1.length+s2.length)/2));
}

But first let’s make a 2D array which will store the intersections for each paragraph with each other.

var sentences = split_into_sentences(content);
  var len = sentences.length;  
  var values = [];
  for(var i=0; i<len; i++){
    values.push([]);
    for(var j=0; j<len; j++){
      values[i].push(null);
      values[i][j] = sentences_insersection(sentences[i], sentences[j]);
    }    
  }

Then convert the 2d array into dictionary. Here the code i == j is for escaping the repetitions. The score is calculated by summing all intersections of a sentence with other sentence of the paragraph.

var dict = {};
  for(var i=0; i<len; i++){   
    var score = 0; 
    for(var j=0; j<len; j++){
      if(i === j) continue;
      score += values[i][j]; 
    }
    dict[format_sentence(sentences[i])] = score;
  }

The dictionary for the above example looks like:

 Alsohewriteablog:0.5666666666666667
 Andheisapersonwhowritescode:0.8205128205128205
 HisnameisBibhuti:0.15384615384615385
 IcodeinJavaScript:0.43333333333333335

From the dictionary choose, the sentence which has the maximum rank. i.e. And he is a person who writes code.

function get_best_sentence(paragraph, dict){
  var sentences = split_into_sentences(paragraph);
  //ignore short sentences
  if(sentences.length < 2) return "";
  //get best sentences according to sentence dictonary
  var best_sentence = "";
  var max_value = 0;
  for(var i = 0; i<sentences.length; i++){
    var s = sentences[i];
    var strip = format_sentence(s);
    if(dict[strip] > max_value){
      max_value = dict[strip];
      best_sentence = s;
    }
  }
  return best_sentence;
}

In similar manner all top ranked sentences can be chosen from all paragraphs to form a summary. The finished application is here.

The source code is on github at https://github.com/bibhuticoder/summaryjs.

Static Portfolio for Programmers

As a programmer we may need a portfolio as a showcase for our projects, resume or maybe blogs. If you got money, just buy a cheap hosting and you are good to go. What if you don’t and you are searching for the best and easiest way?Here I will try to explain how I made my static portfolio and blog. For understanding the concepts, an intermediate knowledge of web development is recommended.

Hosting:
As a programmer most of us are familiar with GitHub. If not “GitHub is a Web-based Git version control repository hosting service”. It provides unlimited pubic repositories (projects) and a GitHub page for each user. Also gives a domain name username.github.io. For that you need to create a repository with name “your_username.github.io” and it need to have an index.html file. If you need help visithttps://pages.github.com.

Projects:
The portfolio obviously contains your list of your projects/researches. Although it is possible to manually add/edit the projects every time, it becomes a problem. So, we can use GitHub itself to store the projects along with its description and use that information in our website. For that we need to use GitHub REST Api V3. You can get the list of projects in JSON format by simply making a GET request.Eg:https://api.github.com/users/bibhuticoder/reposGo to this link and you shall see my list of projects in JSON. With AJAX the list can be rendered easily by JavaScript.

Assets:
Now, to store your assets i.e. CSS, JS, images you can simply use GitHub or maybe other file sharing sites like Google drive and Dropbox.

Blog:
Most of the programmers combine their blogs and portfolio into one. It can also be done with Blogger API v3.There are two ways to access the API i.e. by using OAuth or simply using an API key. The API key method is best for creating a read-only blogs. The complete procedure is best explained at https://developers.google.com/blogger/.
Moreover, a Frontend JS Framework like Angular, React, Vue etc. can be used to make a Single Page portfolio. Other services like Firebase also provide the same facility for web hosting and data storage for free but it has its own limitations.

That's how I made my portfolio and my blog. Visit https://bibhuticoder.github.io to have a look. Also the source code is at https://github.com/bibhuticoder/bibhuticoder.github.io