I'm Mark Scully, a consultant specializing in Data Science. Since Data Science lacks an agreed-upon definition, I'll give mine: applying scientific principles and techniques from math and computer science to make use of data. I've worked on projects in Machine Learning, Data Mining, Predictive Analytics, Natural Language Processing, Computer Vision, Machine Vision, Bioinformatics, Neuroinformatics, Biomedical Imaging, and Data Fusion, among others. I've done everything from algorithm development to low-level optimized implementations and from big data infrastructure to web applications. I'm the Data Science equivalent of a full-stack web developer.
I most often work with Python and C++ but also have experience with High Performance Computing, High Throughput Computing, Hadoop, C, Postgres, and CouchDB. That said, I pick up new technologies fast and am no stranger to R, MATLAB, Hive, Pig, Mahout, Vowpal Wabbit, OpenCV, and many more. I received my Master's in Computer Science, focused on Machine Learning, from the University of New Mexico.
My approach is an applied one. Data Science and Machine Learning are most interesting when they are applied to other domains, and many fields already have Data Science at their heart. I appreciate all the work done on pure theory and make use of it often, but I prefer to apply that knowledge to real-world problems. When doing so, I follow these principles:
Simpler methods are easier to understand, easier to maintain, and easier to explain.
There's always some constraint on how a problem is solved. It is impossible to optimize for everything.
Theory is an excellent guide and intuition is useful, but whenever possible I test everything using the data. I follow the scientific method: form a hypothesis, figure out how to test it, and revise it based on the results.