1025 West Navitus Dr, Appleton, WI 54913, United States of America
Internship
Onsite
Intern
U.S. AutoForce is seeking a Data Science & GenAI Research Intern to join its Insight Engineering team for Summer 2026. This internship focuses on applied research and development in data science and Generative AI, particularly prototyping AI capabilities for enterprise data systems.
Prototype agents that answer business questions by reasoning over certified KPIs, semantic models, and operational data — not by guessing
Build retrieval-augmented pipelines over structured warehouse data and unstructured documents (regulatory content, internal documentation, ticket history)
Explore tool-use patterns where an LLM orchestrates SQL generation, validation against the semantic layer, and result interpretation — with governance and lineage preserved
Evaluate model behavior rigorously — accuracy, hallucination rates, latency, cost — and document trade-offs
Develop and test models against governed datasets in Snowflake — forecasting, classification, anomaly detection, or entity resolution depending on the active research question
Engineer features that could land in a future feature store, with attention to lineage, reproducibility, and certified inputs
Run experiments using Snowflake Cortex AI (native LLM and ML functions in SQL), Python notebooks, and other appropriate tooling
Write up findings in a way the engineering team, product partners, and SLT can consume — what was tried, what worked, what didn’t, and what to do next
Demo prototypes to the Insight Engineering team and contribute to the decision of whether a capability is ready to move from R&D into the platform roadmap
Qualifications
Required
Pursuing a degree in Computer Science, Data Science, Statistics, Applied Math, Machine Learning, or a related quantitative field
Working proficiency in Python, including common data and ML libraries (pandas, scikit-learn, NumPy)
Basic SQL fluency — you can write queries, read execution plans casually, and understand joins, aggregations, and CTEs
Exposure to LLMs or GenAI tooling — coursework, side projects, or hands-on use of APIs (OpenAI, Anthropic, open-weight models, LangChain/LlamaIndex, or equivalents). You do not need production experience; you need genuine curiosity and a portfolio of attempts
Comfort with ambiguity. R&D problems are not specced like feature tickets. You will need to scope your own work, decide when an experiment is done, and communicate trade-offs clearly
Strong written communication. Findings the team can't read are findings the team can't use
Preferred
Hands-on experience with agentic frameworks (tool use, function calling, multi-step reasoning, ReAct-style loops)
Familiarity with retrieval-augmented generation (RAG), vector databases, embeddings, and evaluation methods for LLM outputs
Exposure to Snowflake, dbt, or modern data warehouse tooling — Cortex AI experience is a clear plus
Experience with Power BI or other BI/visualization tools