A much more realistic picture of the world – my experience as a summer intern at Atigeo
Culture, Sep 06, 2016
Big data is certainly a buzzword these days, but what exactly is it? I spent this summer as an intern at Atigeo, a compassionate technology company for a wiser planet that specializes in hybrid-analytics, ahead of my senior year @UofRedlands (#gobulldogs). I’ve learned a lot in my time here, but I get the feeling I’ve only scratched the surface of big data analytics and its potential.
Here’s a bit that I do know. Many companies claim they are doing “big data” when in fact most are simply doing business intelligence. Business intelligence is essentially the technology to acquire and organize raw data so it can be used for business analysis. It’s not data analytics, however; instead, it presents the data in a way that is easy to view and understand so that humans can conduct the analysis. What sets Atigeo apart is its big data analytics platform, xPatterns, which analyzes huge data sets (structured or unstructured) and finds solutions through inference and prediction, with or without human involvement, by looking for patterns in the data. xPatterns can be applied to any business or challenge that leverages data. Atigeo’s primary focus is healthcare and cybersecurity. In healthcare, this involves helping to guide critical decisions, advancing best practices, improving patient outcomes, and fueling research discoveries that eradicate diseases. In cybersecurity, Atigeo is passionate about protecting people and the organizations they entrust their data to by identifying vulnerabilities and the presence of threats, and by predicting and preventing attacks.
While here, I had the once-in-a-lifetime opportunity to interview Dr. Wolf Kohn and Dr. Zelda Zabinsky. Kohn is Atigeo’s Chief Scientist, holds a double PhD from MIT, and is known as the father of hybrid control theory, the set of mathematical concepts that forms the basis of hybrid-analytics and that I am admittedly still just beginning to understand. Zabinsky is a consultant and a long-time friend, partner, and co-author with Kohn of papers, studies, and books; both are also professors at the University of Washington (@UW). With more than thirty years of experience between them, they are among the foremost experts in the field. They were generous enough to sit down with me and answer some of my questions on big data and the applications it has for young people today.
What excites you about big data analytics and its ability to change the world?
Kohn: For me, the most important and exciting thing about big data is that it contains models of reality. Before we were able to process data, the models of reality had to be purely deductive. In other words, we had some principles and we would build a model. Now that we have the ability to convert data into models, we have a much more realistic picture of the world. The term we use is data tomography, i.e., the ability to extract information to build a model of reality.
What are the next great opportunities that will be unleashed by big data in your opinion and how will this technology affect business?
Kohn: I believe that you are going to be able to infer more realistic marketing expectations and forecast the demand for new products from the data. Before, you had to rely on a person to figure it out, and that might mean taking on a big risk.
Zabinsky: That is the marketing side. On the business side, for example, you can figure out what transportation process and inventory you need, along with the kind of support needed to accommodate all of it.
So in the future you see big data being involved in almost all businesses?
Kohn: I believe, given what we have seen, anybody who doesn’t have big analytics capabilities will not be able to compete. Even for small businesses.
How long have you been in this field and how has the field evolved since you started?
Kohn: I have been working on this since we started to do hybrid systems about 15 years ago. Hybrid systems, which is a part of all this data revolution, is a set of theories and technologies for amalgamating data from heterogeneous sources to get better information.
Zabinsky: Yes, the computing capabilities have made it possible. 15, 20, even 30 years ago, computers were nowhere near where they are today. So, it definitely enables a lot of the capability.
Wolf, they call you the father of hybrid control theory, what is this and when did you develop it?
Kohn: The military was trying to solve an interesting problem. They were trying to get very complicated war machines like the M1 tank, which has 26 or 27 processors, and all these analog devices, which they were controlling, to work together. The analog part of the system did not respond well to the digital part of the system because they were totally different. So we developed a control theory with a professor from Cornell, Professor Anil Nerode, that allows you to define the models of the system you are controlling with multiple representations. After we developed this for the military we realized that this is good for other things. In some cases, we have models that are defined by rules, and models that are defined by data. And there are models that are defined by mathematics that you want to be able to amalgamate together. That is a quick explanation of hybrid-analytics.
How are hybrid-analytics used by Atigeo?
Kohn: We have a tool we developed called CDI, which stands for cooperative distributive inferencing. It amalgamates algorithms for control with data, forecasting, and information with rules to achieve desired results.
Can you explain a little more what CDI is?
Kohn: CDI is a tool that allows data and information that is available from different sources, in different forms, to produce a coordinated response. It does this with agents, which are software devices that are trained to understand particular aspects of the data, and each infers a partial answer. You then have to combine the partial answers from the different agents into a single answer in order to get the best answer. An example is when you go to a heart specialist, a kidney specialist, and a specialist on blood diseases. All examine the patient, and each has certain sets of rules and principles that they use. The idea is to get a coordinated diagnosis or a coordinated way of treating the patient.
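To make the specialist analogy concrete, here is a minimal Python sketch of several agents that each infer a partial answer, which is then combined into a single coordinated answer. This is only an illustration of the general idea, not Atigeo's CDI implementation: the agent functions, the thresholds, the confidence numbers, and the confidence-weighted vote in `combine` are all assumptions made up for this example, and none of it is clinical guidance.

```python
from collections import defaultdict

# Hypothetical agents: each looks at one aspect of the patient record and
# returns a partial answer as (conclusion, confidence). All thresholds and
# confidence values below are invented for illustration only.

def heart_agent(patient):
    if patient["resting_heart_rate"] > 100:
        return ("refer to cardiology", 0.8)
    return ("no action", 0.3)

def kidney_agent(patient):
    if patient["creatinine_mg_dl"] > 1.3:
        return ("refer to nephrology", 0.7)
    return ("no action", 0.3)

def blood_agent(patient):
    if patient["hemoglobin_g_dl"] < 12.0:
        return ("refer to hematology", 0.9)
    return ("no action", 0.3)

def combine(agents, patient):
    """Merge the agents' partial answers with a confidence-weighted vote.

    This simple vote is an assumed stand-in for whatever coordination
    method a real system would use.
    """
    scores = defaultdict(float)
    for agent in agents:
        conclusion, confidence = agent(patient)
        scores[conclusion] += confidence
    # The conclusion with the highest total confidence wins.
    return max(scores, key=scores.get)

agents = [heart_agent, kidney_agent, blood_agent]
patient = {
    "resting_heart_rate": 72,
    "creatinine_mg_dl": 0.9,
    "hemoglobin_g_dl": 10.5,
}
result = combine(agents, patient)
print(result)
```

Each agent only knows its own rules, just as each specialist does; the coordination happens entirely in `combine`, which is where a real system would put its more sophisticated logic.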
What is something Atigeo is doing right now with CDI that excites you?
Kohn: I think the most interesting application of CDI that we are doing right now is to allow the use of micro grid centers for delivering electricity.
Zabinsky: Solar panels, energy grids, etc.
Why has big data analytics become so important in cybersecurity?
Kohn: Cybersecurity is basically extracting threats from the universe, the data, and the messages that are flowing around. Because the internet is so powerful and the communication system is so gigantic, those things are quite difficult to monitor. So the point is to fish out threats from this big, dynamic flow of data. Cybersecurity is this scientific, empirical collection of procedures for extracting threats from communication systems.
How does machine learning help with this?
Kohn: We are building models of the threats with machine learning, which characterizes the procedure that the hacker is using.
How about healthcare, why has big data analytics become so important?
Kohn: Healthcare is all about extracting information from sensory data about the patient. For example, with mammography we are trying to find the lesion on the breast before it becomes a cancer problem. The goal is early detection by leveraging the data. What is unique about our approach is that we are trying to train the system on the mammogram itself, and on each mammogram we have a localized model of the patient. This is an example of what they are beginning to call personalized medicine. And for personalized medicine, it is essential that you get the data that is available to construct a model of particular emphasis.
Zabinsky: Another reason data analytics is becoming so revolutionary is that we have the ability to get the data. Now we have sensors that can go into your bloodstream and give you all this information, information that we could not get before. This level of data and information gathering, like what you can measure with Fitbits and related sensors, is incredibly powerful and prolific, which makes data available to you as the individual.
What job opportunities do you see for young people who want a future in big data analytics?
Kohn: In my field of control theory, the new principles that big data is affording for designing sophisticated control systems will create a totally new field. The applications in manufacturing that we are considering did not exist five years ago. And of course, business is going to be run totally differently than it is right now, because it is more information dependent than ever before.
Where are the best focus areas for college and graduate students to study to be relevant for the big data jobs market?
Kohn: Obviously study statistical analysis, and in computer science you can take big data as a subject. Study how to handle terabytes of data, store, manipulate, process and filter it. More areas of interest include inference and optimization. Inference and optimization is machine learning.
Why do you think there is a shortage of talent right now and what is the best way to fix this problem?
Kohn: I have read a lot about this, and there are different schools of thought. One is that several years ago the market for software engineers was very poor, so many people stopped going to engineering school in favor of an MBA, law school, or medical school. The second reason, I believe, is more significant. That is, the baby boomers are retiring en masse. So, there are not enough of the new generation, the millennials, to replace the workforce.
What inspires you listening to college students today?
Kohn: I personally like to elicit new ideas. Keeps me young.
Zabinsky: I just enjoy working with young people, college students, and graduate students, and being an educator at heart. It is nice to provide the education, but it also goes both ways. I learn a lot from the students while they learn from me. It is really exciting.
It is clear to me from my time here, including with Wolf and Zelda, that big data is an exciting field that factors into the future of virtually all industries. Now that technology has reached the point where we can collect all this raw data, the right tools, applications, and thinkers can use it to propel us into an optimal future. Seeing and understanding the xPatterns platform, and the leaders behind it, has made me a believer that the world’s most pressing challenges, whether that be finding the cure for cancer, protecting our infrastructure, or harnessing solar energy, can be solved by unleashing the potential of data.
To take advantage of the capabilities big data provides, it will be important for young people to understand it and learn the proper skills to further the work being done in the field. Almost everyone coming out of college worries about getting a good job. I know I, for one, do. And while I may just be getting started in my own understanding of big data, I am convinced that a career in big data, or at least having rudimentary knowledge of it, will yield not only great job opportunities for my generation, but a better future for us all.