- Introduction to Algorithms
- Environment Setup
- Career Prospects of Studying Algorithms and Data Structures
- Algorithms Basics
- Asymptotic Analysis
- Greedy Algorithms
- Divide and Conquer Approach: An Insight
- Dynamic Programming
Introduction to Algorithms
All of us, whether we have majored in computer science or not, have come across the term ALGORITHM. So what exactly does this word mean? Let me discuss it with you in a simple, lucid manner: I will start from scratch and then work up to the veteran level. Being from a core computer science background, I have the opportunity to give you a glimpse into the world of algorithms and data structures. So, let’s begin without any further formal introduction.
When there were no PCs, there were algorithms (algos, for short), and now that there are PCs, there are even more algorithms. An algorithm is simply a proper arrangement of instructions that helps a system or a person break a problem down, examine it part by part, and then build up a series of precise steps to get the problem solved. In that sense, algorithms and core logic-building are the mother of all modern-day mathematics. With the first systematic methods for solving problems came logic, and with logic came algorithms; and once there were algorithms, there came formal languages to solve such problems with the help of programs, written in what we today call programming languages. One way to look at it is that algorithms are the bedrock of any programming language. When you bake bread from a recipe, you are following an algorithm. When you knit a sweater from a pattern, you are following an algorithm. When you put a sharp edge on a piece of flint by executing a precise sequence of strikes with the tip of an antler (a critical step in making fine stone tools), you are following an algorithm. Algorithms have been a part of human technology since the Stone Age.
Looking broadly through the lens of computer science, algorithms can teach us about the nature of the human mind, the meaning of rationality, and the oldest question of all: how to live. Examining cognition as a way of solving the fundamental computational problems posed by our environment can utterly change the way we think about human rationality.
The notion that studying the inner workings of computers might reveal how to think and decide, what to believe and how to behave, might strike many people as not only wildly reductive but, in fact, misguided. Even if computer science did have things to say about how to think and act, would we want to listen? We look at the AI and robots of science fiction, and it seems like theirs is not a life any of us would want to live. Alan Turing (one of the greatest computer scientists, often informally called the Newton of computer science) defined the very notion of computation by analogy to a human mathematician who carefully works through the steps of a lengthy calculation, yielding an unmistakably right answer. Many people say that studying algorithms is somewhat similar to learning to code, and they often refer to an algorithm as pseudo-code. That is true up to a certain extent, but studying algorithms is not the same as learning to write code. Writing an algorithm means you have laid the code’s foundation; the skyscraper (our code) is then ready to be built on top of it. There is also a common myth among many young people taking up computer science nowadays: they tend to believe that “coding is computer science”. If someone tries to tell you that computer science is all about programming, please don’t believe them. It’s a myth. A part of computer science is concerned with programming, but computer science has a lot more to do with mathematics and logic than with programming.
Buying a widget at the store is governed by dynamics described by economics. We can use economics to answer questions like “why was the widget priced the way it is?” or “why does this store stock widgets in the first place?” But it’s a stretch to say that participation in the economy is doing economics. Similarly, when you input code into your computer, the way that your compiler or interpreter takes the code you wrote and does stuff with it is computer science. Sometimes your code’s numeric variables are stored as floating points, and sometimes those floating points that ostensibly should be equivalent aren’t equivalent; the reason is explained by computer science. But that doesn’t mean coding is computer science. Now, let us explore a little more into the world of algorithms.
Environment Setup
So now that it is clear that algorithms are crucial for practical problem solving, let me give you a few practical examples to help you grasp their importance. If you think that learning algorithms would only help you get past university exams, think again. Algorithms are involved in our daily lives in one way or another. To make this clearer, consider the following cases.
1. Suppose you need to rent a house, so you start inspecting a few houses to check whether any of them suits you. After inspecting a few, you find that none of them is suitable. So now the question arises: should you settle for one of the houses you have already seen, or should you take some additional time and inspect more houses? This is a tough problem that any ordinary person may face at some point in life. The same problem appears in another form too: suppose you are searching for a suitable match for marriage. Should you spend your whole life searching for the perfect match?
Or is there an end to the search? If there is an end, then what is it? How will you know that it is now time to stop looking and settle for one of the reasonable matches you have explored so far? Here is the answer. In computer science, this is called the EXPLORE-EXPLOIT DILEMMA. A great deal of life comes down to balancing exploring new options against exploiting the ones already found. You go to a shop to buy a pair of jeans: should you settle after trying a few pairs and eventually buy one, or should you explore more options in the hope of finding a better one?
The answer is simple: you should explore roughly the first 37% of the total possibilities, and then settle (technically, exploit) for the next option that beats everything you have explored so far, without wasting further time in the hope of something better. (This 37% comes from the value 1/e ≈ 0.37; I am not getting into the derivation of this solution, as it is fairly technical and may be out of scope for the average reader.) So, the next time you are confused about whether to look for more marriage matches or to settle among the already explored options, remember: if you have ten potential suitors, date roughly the first four (about 37% of 10, rounded to the nearest whole number) and then settle for the first one after that who beats them all. This applies to many fields of life, and research shows it gives the most optimal results on average.
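To make the rule concrete, here is a minimal simulation sketch in Python. It assumes candidate qualities are just uniformly random numbers, and the function name `best_choice_with_37_rule` is made up purely for illustration; it is not from any library, and real optimal-stopping analysis is more subtle than this toy.

```python
import random

def best_choice_with_37_rule(candidates):
    """Reject the first ~37% of candidates, then pick the first one
    that beats everything seen so far (settle for the last if none does)."""
    n = len(candidates)
    cutoff = round(n * 0.37)          # the 1/e "look" phase
    best_seen = max(candidates[:cutoff], default=float("-inf"))
    for value in candidates[cutoff:]:
        if value > best_seen:
            return value              # "leap": first candidate better than the look phase
    return candidates[-1]             # nobody beat the look phase; settle for the last one

# Rough check: how often does the rule land the single best candidate?
random.seed(42)
trials = 10_000
hits = 0
for _ in range(trials):
    pool = [random.random() for _ in range(10)]   # ten potential suitors
    hits += best_choice_with_37_rule(pool) == max(pool)
print(f"Picked the very best option in about {hits / trials:.0%} of trials")
```

Running this, the rule lands the single best candidate far more often than picking at random would, which is the whole point of the 37% strategy.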
2. Suppose you have a bag with limited space, and you are inside a shopping complex full of items of different sizes and values. You are asked to pick a bag full of items. So now, how should you pick them? How should you decide which items to choose, and in what quantity, so that you get the maximum benefit out of the offer?
This problem was famously used by cryptographers Ralph Merkle and Martin Hellman in 1978, and it is called the KNAPSACK PROBLEM. It asks: given a fixed bag or container and n different objects of varied values and weights, how should you plan which items to take, and in what quantity, so that your profit is maximum?
3. Consider the following situation. You want to visit your uncle, who stays a little far from your city. You need to buy some milk for your dog. You need to visit the temple on the way back. You also have a grocery list of goods for your household. So, my question is: how would you plan your visits to each place and your return home so that your fuel consumption is minimum? This is the TRAVELLING SALESMAN PROBLEM, for which the Christofides–Serdyukov algorithm gives a well-known approximate solution. In the original statement, a salesman has to visit “n” different cities and come back to his hometown at the minimum cost. You can apply it in almost any realm of life.
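The Christofides–Serdyukov algorithm itself is fairly involved, so here is a much simpler greedy stand-in, the nearest-neighbour heuristic, just to make the idea of planning a cheap round trip concrete. The stops and their coordinates are invented for illustration, and this heuristic does not guarantee the truly optimal tour.

```python
from math import dist  # Python 3.8+

# Hypothetical stops for the errand run (x, y coordinates in km)
stops = {"home": (0, 0), "uncle": (8, 9), "pet store": (2, 3),
         "temple": (6, 2), "grocery": (1, 7)}

def nearest_neighbour_tour(places, start="home"):
    """Greedy heuristic: from the current stop, always drive to the
    closest unvisited stop, then return to the start."""
    unvisited = set(places) - {start}
    tour, current = [start], start
    while unvisited:
        current = min(unvisited, key=lambda p: dist(places[current], places[p]))
        tour.append(current)
        unvisited.remove(current)
    tour.append(start)  # come back home
    return tour

tour = nearest_neighbour_tour(stops)
length = sum(dist(stops[a], stops[b]) for a, b in zip(tour, tour[1:]))
print(tour, f"about {length:.1f} km")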
4. Now consider that you have five jobs to perform, each one equally important. How would you go about doing them? Should you start with the shortest job first (the SJF ALGORITHM), should you take up the heaviest job first (the LJF ALGORITHM), or should you work on each of them for a fixed interval of time and finish everything in instalments (the ROUND ROBIN ALGORITHM)? In computer science, these are called SCHEDULING ALGORITHMS. There are numerous algorithms for scheduling CPU processes, and each of them has real-life counterparts. In large cluster systems, task scheduling is sometimes based on genetic algorithms (GA), which allocate tasks according to the needs of the application; this approach has been attracting more and more attention from researchers. GAs have been widely applied to resource-scheduling difficulties in large-scale, nonlinear cluster systems, where making reasonable use of computing resources and keeping total and average completion times small is an important issue. Research shows that such methods can achieve better load balancing than traditional approaches.
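As a small illustration of the shortest-job-first idea, here is a hedged Python sketch. The job names and burst times are invented, and real CPU schedulers are considerably more sophisticated than this.

```python
def shortest_job_first(burst_times):
    """Run the shortest job first and report each job's waiting time.
    burst_times maps a job name to its duration (hypothetical units)."""
    waiting, clock = {}, 0
    for job, burst in sorted(burst_times.items(), key=lambda kv: kv[1]):
        waiting[job] = clock   # the job waits for everything scheduled before it
        clock += burst
    return waiting

jobs = {"email": 2, "report": 7, "backup": 4, "build": 1, "tests": 5}
waits = shortest_job_first(jobs)
print(waits)
print("average wait:", sum(waits.values()) / len(waits))
```

Sorting jobs by burst time is exactly what minimises the average waiting time, which is why SJF is the textbook baseline against which other schedulers are compared.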
5. Algorithms have travelled a long path to reach where we are today, and modern-day algorithmic analysis is still growing and expanding rapidly. As more and more companies become data-driven, machine learning algorithms gain more importance. Most of the decisions taken today are based on data. “Data is the 21st century’s oil,” they say, and true enough, it is. Most industries are now putting in efforts to automate their processes using technologies like computer vision, which lets a computer view an object and collect basic data for processing; artificial intelligence then decides what to do next, and machine learning teaches computers how to make smart decisions from the processed data. There are many high-tech projects such as the hyperloop, Tesla’s self-driving cars, and more. The health sector has extremely good artificial intelligence applications: nowadays a machine can track whether certain damaged cells are cancerous or not. Frankly speaking, the medical and health-care sector has benefited the most from present-day machine intelligence. Cybercrime analysis and fraud detection have also improved greatly.
6. There came a time when people needed computers that could be trained using a particular class of algorithms, called DEEP LEARNING ALGORITHMS. When computer training first started, we used to train machines with the help of large data sets: we had data sets to teach computers different concepts. Say we have to identify the letter “P”. If there are 100 different ways to write “P”, the computer was trained to recognise each of them as the letter “P”. But eventually, we realised that no matter how large the data set is, there is a chance the computer still won’t interpret the letter. Computational power and storage are limited resources (we don’t have a computer with infinite processing speed and infinite RAM), and training a machine on a complicated pattern needs ever more of both. That is why we started exploring newer ways to train a machine, in which the machine, given a set of guidelines, can train itself without constant human intervention. Neural networks were first proposed in 1943 by Warren McCulloch and Walter Pitts, two University of Chicago researchers who moved to MIT in 1952 as founding members of what is sometimes called the first cognitive science department. A comparable amount of research was going on in the Soviet Union: the mathematician Alexey Ivakhnenko, in the mid-1960s, developed a technique that in today’s era we would call an artificial neural network (ANN). As the name itself suggests, the idea is taken from the brain cells and the neural network that God has built into our bodies.
Career Prospects of Studying Algorithms and Data Structures
Everyone must be thinking that it’s okay to make our daily life a lot easier by studying algorithms, but then, how is it possible to make a career out of it? How can I earn a living if I choose to study algorithms? What are the career prospects that are available for us? Let me answer all your questions one by one.
You can become a very good academician, a researcher, and of course, a faculty member in core computer science.
There are also venues in industry where research itself is the primary, most celebrated activity, and researchers are treated very respectfully. These are the well-known industry CS “labs”. A few years ago, the big five were Microsoft Labs, IBM Labs, Sun Labs, HP Labs, and Nokia Labs. They were shining examples of how research could be embedded in an industrial setting, advancing the state of the art while also having a positive impact on the company’s bottom line. They were clearly delineated from the rest of the company and essentially served as an in-house CS department. One of the core activities of such labs was “technology transfer”, so that the cool things they came up with could be disseminated to the rest of the company.
Algorithms Basics
The basics of algorithm design lie in how we choose to approach the problem statement. In the upcoming sections, we shall see how different approaches help us write different algorithms whose end goal is the same: a solution to the problem. To understand the basics of writing an efficient algorithm, we must keep a few points in mind:
- The algorithm should have minimal time and space complexity.
- The algorithm should be easy to frame as code in any language; it has to be easy to implement.
- There should be a well-defined approach, and that particular approach should be followed throughout the algorithm. If an algorithm follows a greedy approach, it should not switch to some other approach halfway. If a single approach is not followed, chances are that the final solution will contain discrepancies and produce incorrect results.
Keeping these points in mind, let’s see what the primary approach of algorithm building is.
Asymptotic Analysis
Before coming to asymptotic analysis, I would like to discuss time-space complexity. This term is used quite often for any computer program or algorithm. What does it typically mean? It means that in a situation of limited time (where a user would be waiting for the reply) and limited space (where we have limited RAM and hard disk), we need to use the computational resources very carefully.
Hence, we need to check how much of these resources are being consumed, and if a program is consuming too much of them, we should try to find (if possible) a more optimal solution that reduces the time-space complexity. Time and space are usually a trade-off. You can write an algorithm with very few lines of code, yet its time complexity can be huge (as often happens with recursive functions and recursive programs); it is equally possible that an algorithm with very little time complexity takes up a lot of space (as can happen with iterative programs when the number of iterations is very large). So, we need to find a balance that neither consumes too much time nor takes up too much space. This concept is called time-space complexity. Before going to the next topic, I would like to mention one fundamental point.
If we are finding the upper bound or the lower bound of an algorithm, it is usually valid only after a certain input size. Before that, the function’s curve can fluctuate arbitrarily, even above the upper bound or below the lower bound. This particular input size is denoted by N, and beyond it our main analysis begins. Since the main fluctuations occur at the initial values, N marks the point in the input range after which the curve becomes reasonably stable, and it is there that we apply all the concepts needed to find the upper bound and the lower bound.
For every quantity, we usually have an upper limit and a lower limit; whatever the quantity, we humans like to know the range within which it lies. There are maximum profit estimates in any business and a maximum run rate in a match of cricket. Computers are no exception. Whenever a computer performs a task, we want to know, in the worst-case scenario, how much time it will take to complete that task. So we need a parameter to define the upper bound, and likewise one for the lower bound (which signifies the best-case scenario). Also, it is not always so unfortunate that we hit the worst case, nor is it always the best case; technically, we also need something that tells us the average-case scenario, the time the computer would need when the given input lies somewhere between the best-case and the worst-case inputs. Let’s explore these concepts one by one and see how each is used.
- THE BEST-CASE SCENARIO – This is associated with big omega (Ω), which describes the lower limit of the performance curve. In the best case, the function or program takes the least possible time to execute. By looking at the lower bound of the performance curve, a computer scientist can easily determine the minimum time required when everything goes well; it also makes clear that beyond this point, the algorithm or program cannot be optimised further. But in real-world scenarios, the best case occurs rarely, so it is not the bound most frequently used.
- THE WORST-CASE SCENARIO – This is the case when the inputs work completely against the flow of the algorithm. Think about it: if all the inputs are long, complex, and complicated, does the computer have the choice to say, “This is too much, I am sorry, I can’t handle this!”? No, right? No matter how complex the inputs are, the computer has no choice but to execute them. Just as we humans can calculate the value of 2 + 2 in seconds, but if the input given to us is, say, the series (1 + x)^n = 1 + nx/1! + n(n − 1)x²/2! + … with x = 6, it will definitely take at least a few minutes, or even more, to evaluate the exact answer. So the computational time and complexity depend on how simple or complicated the given input is. As the old saying goes, prepare for the worst and hope for the best; in the same way, computer scientists evaluate the worst-case scenario just to get an idea of whether it is too bad to be handled. At times the worst case is so frustrating that it may take days to get the results; I am not joking: for programs with exponential time-space complexity and a very long input, it can feel as if the program will never finish. So the worst-case scenario needs to be calculated to get that estimate. It is usually denoted by big O.
- THE AVERAGE-CASE SCENARIO – Finally, we have something we can refer to for most common, less technical purposes. When someone asks how much time it takes to reach Mumbai from Goa, we naturally give an average estimate; it may be somewhat more or somewhat less than that. If there is no traffic, we get the best case; if there is heavy traffic, the worst case; but what if the traffic is moderate? That is where our last parameter for the time-space complexity of an algorithm comes in, denoted by big theta (Θ). It is particularly useful for problems that are NP-hard, for which no efficient (polynomial-time) solution is known and many candidate solutions are possible (NP stands for non-deterministic polynomial time). In those cases, the worst case can give you a real shock, and the best case rarely helps because best-case inputs are very unlikely to occur; hence, this notation is especially useful for such problems.
Examples of asymptotic analysis:
Let’s take a very popular search algorithm: binary search.
In a binary search, the search proceeds by repeatedly slicing a sorted array into two parts: we find the middle element, compare it with the key (target) element, and then continue the search in the half that can still contain it. Now, let’s check what the best-case and worst-case scenarios look like here.
Best Case scenario-
If the element to be found is already at the middle position, there is no need to carry out any further operations. We have reached the target, program execution stops, and the compiler prints the result. In this case, the time complexity is a single comparison, so it takes only constant time: O(1).
The worst-case scenario-
In the worst-case scenario, the element may be at one of the extreme ends (or not present at all), and can only be tracked down if we keep narrowing the array by slicing it into halves. By the end of the search, the time complexity is O(log n), where n is the total number of elements present in the array.
The average case scenario-
In the average case, the target is found somewhere between the first comparison and the final halving, and the complexity still works out to O(log n).
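To tie the three cases together, here is a minimal, hedged Python sketch of binary search; the sample data is made up for illustration.

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:      # best case: target sits in the middle -> O(1)
            return mid
        elif arr[mid] < target:
            low = mid + 1           # discard the left half
        else:
            high = mid - 1          # discard the right half
    return -1                       # worst case: about log2(n) halvings -> O(log n)

data = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search(data, 16))   # found after a couple of halvings
print(binary_search(data, 3))    # not present: takes the full O(log n) halvings
```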
Greedy Algorithms
This is a class of algorithms that provides very efficient solutions to certain real-life problems. Let’s see what it is and how it works. As the name suggests, a greedy algorithm always tries to maximise a particular parameter, loosely called the profit. By profit I don’t necessarily mean money or wealth; in computer science, profit maximisation means that when we solve a particular problem, we want to optimise one, two, or sometimes even more parameters and get the best out of them. In a greedy approach, the immediate subproblem is solved first, with the goal of maximising it to the fullest, that is, finding the most optimal solution possible for that particular subproblem. In the next part of the problem, the sole target is again to fetch the best possible result, and this process continues until the whole problem is solved by solving all of its subparts as well as possible. The solution of each subpart can be considered a local optimum. When all these local optima are gathered, they are clubbed together into a single solution, which we take as the global optimum for the given problem statement.
Examples of greedy algorithms, a brief case study:
Let’s take a few famous greedy algorithms to understand this concept better-
1. 0/1 knapsack or fractional knapsack problem
This is a problem we come across numerous times, and a lot of people have wondered whether there is a neat solution to it. The problem statement is: “You are given certain objects, each having a certain weight and a certain value. You have a sack (referred to as a knapsack) with a fixed capacity. In what quantity should each object be taken into the knapsack so that you get the maximum benefit?”
You can think of this as having a fixed salary and going shopping for certain items for your household: how do you shop so that you carry home the most value? Sounds interesting, right? The fractional version of this problem is solved with a greedy approach by computing the value/weight ratio of each article and filling the sack first with the article that has the highest ratio, then the next highest, and so on until the sack is full and there is no space left. (The 0/1 version, where items cannot be split, needs dynamic programming instead.)
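Here is a hedged Python sketch of that greedy ratio rule for the fractional knapsack; the item values, weights, and capacity are toy numbers chosen for illustration.

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Greedily take the highest
    value/weight ratio first; the last item may be taken fractionally."""
    total = 0.0
    for value, weight in sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True):
        if capacity == 0:
            break
        take = min(weight, capacity)        # whole item if it fits, else a fraction
        total += value * (take / weight)
        capacity -= take
    return total

items = [(60, 10), (100, 20), (120, 30)]    # (value, weight): a classic toy instance
print(fractional_knapsack(items, capacity=50))   # 240.0
```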
2. Dijkstra’s algorithm
This is another very famous greedy algorithm. It calculates the shortest path from a chosen source node to every other node in a weighted graph, minimising the cost of travelling from one node of the graph to another and thereby giving the least cost to reach the destination.
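Here is a hedged Python sketch of Dijkstra’s algorithm using a priority queue; the small road network in it is invented purely for the example.

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbour, edge_cost), ...]}.
    Returns the cheapest known cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]                      # (cost so far, node)
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                          # stale entry: a cheaper route was already found
        for neighbour, edge in graph[node]:
            new_cost = cost + edge
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return dist

roads = {"A": [("B", 4), ("C", 1)], "B": [("D", 1)],
         "C": [("B", 2), ("D", 5)], "D": []}
print(dijkstra(roads, "A"))   # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

The greedy step is visible in the heap: at every iteration we commit to the cheapest node we have not finalised yet, and that local choice turns out to be globally safe for non-negative edge costs.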
3. Prim’s algorithm
This finds a minimum spanning tree of a weighted, connected graph, growing the tree one cheapest edge at a time. This is very useful, as spanning trees are used extensively for various technical purposes in core computer science.
4. Kruskal’s algorithm
Kruskal’s algorithm is another great example of a greedy algorithm. It also builds a minimum spanning tree from a given weighted graph by choosing the cheapest remaining edge at each step (as long as it does not close a cycle), doing the best it can at that particular instant and hence maximising the overall profit.
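Below is a hedged Python sketch of Kruskal’s algorithm with a simple union-find structure; the edge list is a toy example, not data from any real network.

```python
def kruskal(nodes, edges):
    """edges: list of (cost, u, v). Pick the cheapest edge that does not
    close a cycle, using a simple union-find, until the tree is complete."""
    parent = {n: n for n in nodes}

    def find(x):                      # follow parents up to the set's representative
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps the trees shallow
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: this edge is safe to add
            parent[ru] = rv
            tree.append((u, v, cost))
    return tree

edges = [(1, "A", "B"), (3, "B", "C"), (4, "A", "C"), (2, "C", "D"), (5, "B", "D")]
print(kruskal("ABCD", edges))   # [('A', 'B', 1), ('C', 'D', 2), ('B', 'C', 3)]
```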
So these are a few famous greedy algorithms. So you must be thinking, what is the benefit of using greedy algorithms?
- Very few trade-offs, which makes this approach a good fit for many optimisation problems.
- It gives a very good (often optimal) solution very fast, without many complications.
- In networking, many protocols, like OSPF and a few others, make use of greedy shortest-path algorithms to ensure that packets spend the minimum time in the network.
Divide and Conquer Approach: An Insight
So the basics of algorithm design begin with how we can summarise our problem statement and then write a set of instructions to get started. How do we summarise our problem? There are many ways, but the first and foremost is to divide the problem into a few subproblems, work on each subproblem independently to get a partial result, and, when all the subparts are solved, combine them into a solution for the whole. This is called the divide and conquer strategy; it is the same “divide and rule” idea that once helped colonial powers rule entire countries: a problem that stands united is hard to defeat, but once divided, each piece can be conquered. So an algorithm specialist first checks whether the problem can be split into a set of smaller subproblems. Depending on the style in which the solutions of the subproblems are combined, we have two different techniques:
- Top-down approach – This is often referred to as the recursive approach. We start from the full problem and write mathematical equations, called recursive functions, that keep reducing it until we reach the function’s base. Someone may ask at this point: how do we know when to stop the recursion? The answer is simple: every such problem has a base condition, and we stop there; that is how the complete problem gets solved. To understand this, take the Fibonacci series over non-negative terms. The sequence here is not defined for negative indices, so the base condition is the 0th term and the 1st term. To calculate the nth term, we keep writing recursive equations until we reach those last two terms, the 0th and the 1st, which form the base condition.
- Bottom-up approach – This means we write iterative code, typically to avoid recursive functions (I will explain below why this is done). We take a single instance, or in technical terms a module, write a loop or some other iterative statement, and, since we know the limit up to which the subproblem extends, we run the loop up to that limit to build a full-fledged solution to the problem. (A small sketch contrasting the two approaches on the Fibonacci example follows this list.)
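Here is that hedged sketch, assuming we simply want the nth Fibonacci number; the function names are made up for the comparison.

```python
def fib_top_down(n):
    """Recursive (top-down): keeps calling itself until it hits the
    base condition (the 0th and 1st terms). Clean, but each call
    occupies a stack frame, and without memoisation it repeats work."""
    if n < 2:
        return n            # base condition
    return fib_top_down(n - 1) + fib_top_down(n - 2)

def fib_bottom_up(n):
    """Iterative (bottom-up): starts from the base values and loops up
    to n, so no call stack grows and no work is repeated."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_top_down(10), fib_bottom_up(10))   # 55 55
```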
Now, coming back to the main question: why do we hesitate to use recursive functions unless there is no other option? The answer is that when we write a recursive solution, we need a particular data structure called a stack (which we shall see later) to maintain and store the state of every recursive call. This creates a requirement for extra space and therefore higher space complexity. Hence, it is usually advisable to use an iterative implementation of an algorithm unless recursion is really necessary. Now, let’s look at a few classic problems that are solved with divide and conquer.
Examples of Divide and Conquer:
- Merge sort
- Binary search
- Quick sort
- Strassen’s algorithm, for matrix multiplication
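As a concrete divide-and-conquer illustration, here is a hedged Python sketch of merge sort on a made-up list: the list is divided in half, each half is conquered recursively, and the sorted halves are merged.

```python
def merge_sort(arr):
    """Divide the list in half, sort each half recursively, then
    combine the two sorted halves by merging them."""
    if len(arr) <= 1:
        return arr                      # a list of 0 or 1 items is already sorted
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])        # divide and conquer the left half
    right = merge_sort(arr[mid:])       # divide and conquer the right half
    merged, i, j = [], 0, 0             # combine: merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))   # [3, 9, 10, 27, 38, 43, 82]
```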
Dynamic Programming
Initially developed by Richard Bellman in the 1950s, this is one of the most extensively used mathematical approaches for solving a problem by dividing it into smaller sections. There are times when we detect certain subproblems that keep repeating throughout the problem statement. What we generally do is store their values in a table, similar to a hash table, and look them up whenever they are needed again. It’s like saving a contact number on your smartphone: whenever needed, you just open your contact book and dial the number from there instead of manually entering each digit into the dialer. This method is very beneficial when a computation involves a large number of repeated subproblems and we don’t want to waste time calculating the same thing over and over.
To solve any problem using dynamic programming, we usually follow two steps:
- Finding the optimal substructure – Dynamic programming problems are solved using recurrence equations. So how is a recurrence derived? If some pattern or some part of the problem keeps repeating, that pattern can be studied and put into a mathematical equation, so that we can exploit it to solve the problem statement as a whole. In technical language, this is called the optimal substructure: a repeating structure of the problem that can be framed as recursive functions. By solving the optimal substructure, we can obtain the solution to the whole problem statement.
- Framing the recursive equation – Once the optimal substructure is identified, we frame the recursive function that solves it.
A few examples of the dynamic programming approach:
- FLOYD WARSHALL’S ALGORITHM
- FINDING THE LONGEST COMMON SUBSEQUENCE (sketched below)
- FINDING THE SOLUTION TO THE TRAVELLING SALESMAN PROBLEM
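As a small illustration of both steps (the optimal substructure and the recursive equation), here is a hedged Python sketch of the longest common subsequence problem solved top-down with memoisation; the example strings are arbitrary, and `lru_cache` plays the role of the lookup table described above.

```python
from functools import lru_cache

def longest_common_subsequence(a, b):
    """Recurrence (the 'recursive equation'):
    lcs(i, j) = 1 + lcs(i-1, j-1)              if a[i] == b[j]
              = max(lcs(i-1, j), lcs(i, j-1))  otherwise,
    with lcs(i, j) = 0 when either string is exhausted.
    lru_cache stores each subproblem so it is computed only once."""
    @lru_cache(maxsize=None)
    def lcs(i, j):
        if i < 0 or j < 0:
            return 0                           # base condition: ran out of characters
        if a[i] == b[j]:
            return 1 + lcs(i - 1, j - 1)
        return max(lcs(i - 1, j), lcs(i, j - 1))
    return lcs(len(a) - 1, len(b) - 1)

print(longest_common_subsequence("ALGORITHM", "LOGARITHM"))   # 7, e.g. "LGRITHM"
```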
So this is how we need to frame our solution.
This brings us to the end of the blog on Introduction to Algorithms: Uses and Types. Thanks for reading this blog, and we hope you were able to gain some valuable insights. If you wish to learn more such concepts, join Great Learning Academy’s Free Online Courses and upskill today.