Intelligent Medical Objects (IMO) (Houston, TX) seeks a Sr. Data Scientist II to directly influence both our products and our clients by running analytical experiments in a methodical manner and regularly evaluating alternate models via theoretical approaches.
Specific Duties Include:
Analyze and process textual data for bioinformatics applications using the Melax CLAMP software kit and clinical NLP techniques
Design, customize, and extend the existing Melax software suite and web-service applications according to Melax product needs and customer requirements
Develop, maintain, and improve NLP applications that process unstructured biomedical texts into structured and searchable information
Modify and improve current Melax products by developing and incorporating cutting-edge machine learning and deep learning algorithms and techniques for enhanced performance and usability
Communicate with customers, analyze their NLP needs and requirements, deliver products and projects, and provide assistance
Work within the NLP development team to develop NLP modules in different programming or scripting languages such as Java, JavaScript, J2EE, and HTML
Conduct pre-processing and quality analyses for textual data inputs and performance validation for NLP output
Create systematic testing, error-checking procedures, and user manuals
Conduct customer consultation and technical support on NLP training, installation, development, and deployment
Share knowledge with team members and across the organization on topics including new and emerging NLP methods and technologies
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Must take and pass a Python/Java coding test that requires solving one NLP algorithm problem.
Option to work remotely 60% of the time.
Position Requires:
Bachelor’s degree, or foreign equivalent, in Computer Information Systems, Informatics, or a closely related field of study, plus 5 years of experience in the job offered, or as an NLP Developer, NLP Data Engineer, NLP Data Scientist, Research Assistant, or a closely related NLP position.
Must have 5 years of experience in the following:
Developing NLP applications and building machine learning models
Developing ETL pipelines and processes in big data environments
Deploying, maintaining, versioning, and A/B testing machine learning models
Working in at least one of these databases: AWS Redshift, Oracle, SQL Server, or MySQL
Using SQL to write complex queries across large volumes of data
Developing and deploying full-stack solutions in Python
Using and following standardized development practices and tools, including TFS/Git, code standards, and process standards
Writing unit tests using standard unit test frameworks
Working with and applying statistical techniques, concepts, methods, and approaches
Using multivariate calculus and linear algebra.
Must have 3 years of experience with the following:
Working with AWS tools including Lambda and SageMaker
Creating and using process documentation and workflows
Working with TensorFlow and TensorFlow Serving
Working with statistical modeling using R or MATLAB
Using big data frameworks such as Spark/PySpark
Working with Tableau, Looker, QlikView, R Shiny, or similar data visualization tools.
Must also have 2 years of experience using infrastructure-as-code tools, such as Terraform.