Data science is an umbrella term that encompasses data analytics, data mining, machine learning, and several other related disciplines. While a data scientist is expected to forecast the future based on past patterns, data analysts extract meaningful insights from various data sources.
Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science. Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge. In 2015, the American Statistical Association identified database management, statistics and machine learning, and distributed and parallel systems as the three emerging foundational professional communities.
Areas of Expertise
Systems with inherent complexity arise in business processes, social and economic interactions, and biological activity. We use mathematical methods to decompose such systems into sets of interacting components and model their activity. These models can then be simulated to obtain quantitative predictions and to identify the emergence of unexpected behavior.
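As a minimal sketch of what decomposing a system into interacting components can look like, the example below splits a population into three compartments (susceptible, infected, recovered) and steps their interaction forward in time. The SIR model, the function name `simulate_sir`, and all parameter values are illustrative assumptions chosen for brevity, not a description of any particular project.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=0.1):
    """Euler-integrate a three-component SIR model.

    beta  : assumed contact/infection rate (hypothetical value supplied by caller)
    gamma : assumed recovery rate (hypothetical value supplied by caller)
    Returns a list of (susceptible, infected, recovered) tuples, one per step.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        # Interaction term: infections need both susceptible and infected members.
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    trajectory = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, steps=1000)
    peak_infected = max(i for _, i, _ in trajectory)
    print(f"peak infected: {peak_infected:.1f}")
```

The rise and fall of the infected compartment is not written into any single component; it appears only when the components interact, which is the kind of behavior a simulation is meant to expose.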
Fluctuations and randomness pepper almost every observable behavior, hiding patterns and correlations. We use methods from statistics, probability, statistical physics, information theory, and econometrics to analyze business as well as scientific problems and develop algorithms that predict outcomes.
While modeling and prediction allow us to understand system behavior, knowing the optimal response to this discovered behavior is essential for effective analytics. We leverage the mathematics and science of Operations Research and allied fields to formulate and solve optimization problems that inform decision-making.
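To make the idea of formulating a decision as an optimization problem concrete, here is a small sketch of a classic Operations Research exercise: choosing which projects to fund under a fixed budget so that total value is maximized (the 0/1 knapsack problem, solved by dynamic programming). The function name `best_portfolio_value` and the numbers in the usage example are hypothetical.

```python
def best_portfolio_value(values, costs, budget):
    """Return the maximum total value of projects whose total cost fits the budget.

    Each project is either fully funded or skipped (0/1 knapsack).
    Costs and budget are assumed to be non-negative integers.
    """
    # best[c] holds the best value achievable with a budget of exactly c or less.
    best = [0] * (budget + 1)
    for value, cost in zip(values, costs):
        # Iterate budgets downward so each project is counted at most once.
        for c in range(budget, cost - 1, -1):
            best[c] = max(best[c], best[c - cost] + value)
    return best[budget]


if __name__ == "__main__":
    # Three hypothetical projects with values 60, 100, 120 and costs 10, 20, 30.
    print(best_portfolio_value([60, 100, 120], [10, 20, 30], budget=50))
```

Larger or continuous problems would normally be handed to a dedicated solver; the point here is only the shape of the formulation: an objective, a constraint, and a set of decisions.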
Scientific computing involves the use of advanced computing capabilities to solve complex problems in the physical, biological and social sciences. We use Monte Carlo and molecular dynamics simulations, graph theoretic and network science techniques, dynamical systems analysis, as well as image and signal processing.
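For readers unfamiliar with Monte Carlo methods, the sketch below shows the basic pattern on a toy problem: estimating pi by sampling random points in the unit square and counting how many fall inside the quarter circle. The function name `estimate_pi` and the fixed seed are assumptions made for reproducibility of the illustration.

```python
import random


def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi from uniform random points in the unit square."""
    rng = random.Random(seed)  # seeded generator so repeated runs agree
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * inside / n_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The error of such an estimate shrinks roughly with the square root of the number of samples, which is why production simulations lean on large sample counts and parallel hardware.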
We encapsulate our solutions into deployable applications, either on cloud platforms or on the client's premises. We design and develop systems to integrate and store data, perform computations, and display information through either web or desktop applications.
Communication, Data Visualization, and User Experience
No amount of analysis achieves the desired result if it cannot be immediately, accurately and convincingly communicated among stakeholders with minimum confusion. We employ principles of written and graphical communication as well as interface and interaction design to streamline the comprehension of our reports.