Data Scientist - NLP
Job summary
Analytica is seeking a Data Scientist to support long-term federal client engagements. This position is fully remote.
Analytica has been recognized by Inc. for 3 consecutive years as one of the 250 fastest-growing businesses. We offer competitive compensation with opportunities for bonuses, employer-paid health care, training and development funds, and a 401(k) match.
Responsibilities
- Pre-processing: Demonstrate skills to collect, clean, and prepare data sets for input into a computational model using Python. Explain methods applied using common pre-processing functions such as stop word removal, stemming, lemmatization, and tokenization.
- Feature Engineering and Attribute Evaluation: Demonstrate experience with NLP feature engineering methods such as TF-IDF, word2vec, GloVe, and FastText. Identify key determinants for modeling and select evaluation protocols.
- Modeling: Practice skills selecting classification modeling techniques to fit the business problem, including machine learning (ML) supervised and unsupervised learning, regression, neural networks, deep learning, and natural language processing.
- Validation: Describe experience with investigating, reporting, and justifying model results.
- Visualization: Present results of modeling activities, depict insights, and explain the relevance of results to business challenges.
Qualifications
- Master's degree in Statistics, Mathematics, Computer Science, or a similar field required; PhD preferred.
- High degree of experience utilizing SAS, R, or Python to support NLP use cases such as Document Summarization, Named Entity Recognition, Sentiment Analysis, and/or Topic Modeling.
- At least four years of experience developing scalable, production-ready NLP solutions using scikit-learn, Keras, TensorFlow, PyTorch, Spark NLP.
- Experience using Git/GitHub for version control.
- Experience leveraging transformer architecture to develop NLP models.
- Experience with open-source NLP packages such as Gensim, spaCy, or NLTK.
- Experience with BERT, GPT-J, RoBERTa, T5 or other transformers.
- Experience with GenAI and Prompt Engineering is a plus.
- Experience with Databricks and MLflow is a plus.
- Experience with machine translation and transcription of foreign language documents using Microsoft Azure translation services is a plus.
- Experience working in an AWS cloud environment and with related AWS services such as Bedrock and Textract.
- Experience coordinating and maintaining user stories.
- Must be a US citizen.
- Must be able to obtain and maintain a Public Trust security clearance.
About ANALYTICA
Analytica is a leading consulting and information technology solutions provider to public sector organizations supporting health, civilian, and national security missions. Founded in 2009 and headquartered in Bethesda, MD, the company is an established SBA small business. Analytica specializes in software and systems engineering, information management, analytics & visualization, agile project management, and management consulting services. The company is appraised by the Software Engineering Institute (SEI) at CMMI® Maturity Level 3 and is an ISO 9001:2008 certified provider.