A Guide to Automated Assessment Content Development in STEM


Science, Technology, Engineering, and Math (STEM) learning is critical, as these disciplines permeate our world and make possible the interconnected, technology-based life we enjoy. Most countries have educational standards in school and higher education. Those standards address language and domestic pedagogical approaches, and they vary in complexity and scope. In United States school systems, Common Core standards offer the most advanced learning goal-oriented framework and taxonomy of learning objectives in which STEM figures prominently. While Common Core provides learning guidelines for U.S. schools, STEM also figures into all levels of education across the globe.



Effective STEM learning means developing and applying scientific and mathematical reasoning and problem-solving skills. According to the National Research Council, successful STEM students “have opportunities to learn science, mathematics, and engineering by addressing problems that have real-world applications.”

Therefore, successful STEM education relies on students solving STEM problems tied to the grade-level learning objectives specified in the Common Core standards or national curricula. So, how are STEM problems developed?



Today, STEM problem development is a fully manual effort. Educators are faced with the serious issue of finding content that fits their needs, their teaching style, and pedagogy. The search requires substantial time, effort, and potentially, cost.

Beyond these challenges, uneven problem quality, insufficient problem quantity, and an inability to scale industrially all limit the success potential of STEM education. The good news is, an automated approach to STEM problem development addresses all these issues, and this approach is within our grasp.



An automated, artificial intelligence (AI) approach to STEM problem development offers two key benefits that mirror advantages gained by consumers from high-quality automated production lines. The first benefit is financial. Once an AI-based machine is trained to generate STEM problems, it will create high-quality content more quickly, in greater quantity, and less expensively than a fully human expert-based approach. The second is pedagogical. Teaching academic and theoretical STEM concepts would gain additional relevant content, with more options available for self-paced learning and adaptive learning pathways. The net result will be better STEM learning outcomes for more students.



Automated STEM problem development would yield a rich, comprehensive problem set, and AI techniques make this possible. So, what kind of STEM problems could be developed?

Here are five classes of STEM problems that are amenable to automated development.


1 – Create new problems based on supplied parameters, including the learning objective, difficulty, and desired Bloom’s Taxonomy objective. Bloom’s Taxonomy characterizes higher learning in stages ranging from “Knowledge” of previously learned information through “Evaluation” or making and defending evidence-based judgements.

Example: Generate problems about “Subtraction of expressions with rational exponents” that are of “medium” difficulty and that test the student’s “Comprehension” as defined by Bloom.

2 – Tag textbook and other educational content as belonging to specific elements of Common Core, or another educational standard. Presently, tagging is a completely manual process. Automated tagging would identify existing problems in textbooks or other source materials as elucidating a learning objective for a given level of difficulty and Bloom’s Taxonomy classification. Once identified, the tagged content would provide a model for generating sets of similar problems.

Example: The tagging engine encounters the problem “Find 45% of 120”. It then analyzes the problem and identifies the learning objective as “Compute Basic Percentages”. This tagged problem becomes a model for generating similar problems.
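To make the tagging idea concrete, here is a minimal sketch that stands in for a trained tagger with two hand-written regular expressions. The rule table, the `tag_problem` function, and the “Calculate Slope” label are hypothetical names chosen for illustration; a real tagging engine would presumably learn these associations from expert-labeled content rather than from fixed patterns.

```python
import re

# Hypothetical rule table: (pattern, learning objective).
# Assumption: a production tagger would be trained, not hand-coded like this.
RULES = [
    (re.compile(r"^find\s+\d+(\.\d+)?%\s+of\s+\d+(\.\d+)?$", re.IGNORECASE),
     "Compute Basic Percentages"),
    (re.compile(r"\bslope\b", re.IGNORECASE),
     "Calculate Slope"),
]


def tag_problem(text):
    """Return the first matching learning objective, or None if no rule fires."""
    cleaned = text.strip().rstrip(".")
    for pattern, objective in RULES:
        if pattern.search(cleaned):
            return objective
    return None


print(tag_problem("Find 45% of 120"))
```

Returning `None` for unmatched problems leaves room for a human expert to tag the item, which fits the supervised workflow described later in this guide.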

3 – Generate a set of numerical problems for specified learning objectives, difficulty and Bloom’s Taxonomy objective, based on a model.

Example: Take a model problem such as “Find 30% of 400” and, based on that problem, create a set of variations.
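One simple way to picture this class of problem is as a template whose numbers are drawn at random. The sketch below is an illustration only: the function name `percent_variations`, the value ranges, and the choice to keep answers whole numbers are all assumptions, not part of any existing system.

```python
import random


def percent_variations(count, seed=0):
    """Generate `count` distinct 'Find p% of n' problems with their answers."""
    # 19 percent values times 50 base values gives 950 distinct problems.
    if count > 950:
        raise ValueError("count exceeds the number of distinct problems")
    rng = random.Random(seed)  # seeded so a problem set can be reproduced
    problems = {}
    while len(problems) < count:
        percent = rng.randrange(5, 100, 5)   # 5, 10, ..., 95
        base = rng.randrange(20, 1001, 20)   # 20, 40, ..., 1000
        # percent is a multiple of 5 and base a multiple of 20,
        # so the answer is always a whole number.
        problems[f"Find {percent}% of {base}"] = percent * base / 100
    return list(problems.items())


for text, answer in percent_variations(3):
    print(text, "->", answer)
```

Constraining the ranges is one way to hold difficulty roughly constant across the generated set; a different difficulty level would use different ranges.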

4 – Develop problem variations with the same learning objective and differing topics.

Example: For a slope calculation learning objective, an initial “CONSTRUCTION” problem might read “A roof rises 8.75 ft in a horizontal distance of 15.09 ft. Find the slope of the roof to the nearest hundredth.”

Using automation, take the same learning objective and change the verbiage and argument ranges to fit a “SCIENCE AND MEDICINE” topic such as “An airplane covered 15 mi of its route while decreasing its altitude by 24,000 ft. Find the slope of the line of descent that was followed”.
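The two slope problems above can be viewed as one learning objective rendered through different topic templates. The following sketch shows that idea under stated assumptions: the `TEMPLATES` table and field names are hypothetical, and the airplane's 15 mi is treated as horizontal distance, which the problem text does not state explicitly.

```python
FEET_PER_MILE = 5280

# Hypothetical topic templates that share one objective: slope = rise / run.
TEMPLATES = {
    "CONSTRUCTION": (
        "A roof rises {rise_ft} ft in a horizontal distance of {run_ft} ft. "
        "Find the slope of the roof to the nearest hundredth."
    ),
    "SCIENCE AND MEDICINE": (
        "An airplane covered {run_mi} mi of its route while decreasing its "
        "altitude by {drop_ft:,} ft. Find the slope of the line of descent."
    ),
}


def slope(rise_ft, run_ft):
    """Slope as rise over run, with both values in feet."""
    return rise_ft / run_ft


roof_text = TEMPLATES["CONSTRUCTION"].format(rise_ft=8.75, run_ft=15.09)
roof_answer = round(slope(8.75, 15.09), 2)

plane_text = TEMPLATES["SCIENCE AND MEDICINE"].format(run_mi=15, drop_ft=24000)
# Descent is a negative rise; miles are converted to feet so units agree.
plane_answer = round(slope(-24000, 15 * FEET_PER_MILE), 2)

print(roof_text, "->", roof_answer)
print(plane_text, "->", plane_answer)
```

The unit conversion in the airplane case illustrates why a topic change involves more than swapping words: the argument ranges and units shift with the context.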

5 – Create step-by-step problems and their solutions. Begin with a problem specification that requires multiple steps with intermediate results to reach its solution. Using automation techniques, deconstruct the initial problem into its sub-steps and test for a satisfactory solution of each sub-step. Then provide additional problems for any sub-steps that are not solved correctly.

Example: Solve the quadratic equation x² - 2x - 8 = 0.

  1. Factor the left-hand side into two factors, (x+2) and (x-4)
  2. Re-state the equation as (x+2) * (x-4) = 0
  3. Set each factor equal to zero, since a product is zero only when at least one of its factors is zero:
    1. (x + 2) = 0 -> x = -2
    2. (x - 4) = 0 -> x = 4
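The worked example can be sketched in code for the simple case of a quadratic x² + bx + c with integer factors. This is an illustrative sketch, not a general solver: the function names are hypothetical, and the final step relies on the zero-product property (a product is zero only when one of its factors is zero).

```python
def factor_text(p):
    """Render the factor (x + p) with its sign written naturally."""
    return f"(x {'+' if p >= 0 else '-'} {abs(p)})"


def solve_monic_quadratic(b, c):
    """Solve x^2 + b*x + c = 0 by integer factoring.

    Returns (steps, roots), or None when no integer factoring exists.
    """
    limit = max(abs(b), abs(c))
    for p in range(-limit, limit + 1):
        q = b - p              # guarantees p + q == b
        if p * q == c:         # then x^2 + b*x + c == (x + p)(x + q)
            steps = [
                f"Factor: {factor_text(p)} and {factor_text(q)}",
                f"Restate: {factor_text(p)} * {factor_text(q)} = 0",
                f"Zero-product property: x = {-p} or x = {-q}",
            ]
            return steps, sorted({-p, -q})
    return None


steps, roots = solve_monic_quadratic(-2, -8)
for line in steps:
    print(line)
print(roots)  # [-2, 4]
```

Because each step is a separate string, a student's intermediate answer could in principle be checked against the matching step, which is the sub-step testing this problem class calls for.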

Using AI techniques, a rich set of STEM problems would be available to educators and students alike. Automation may focus on textbooks from recognized content providers or on public-domain material. In the latter case, automation would create textbook-agnostic problem sets.

When AI solutions and robotics are mentioned, many are concerned that automation will rob workers of jobs. With STEM problems, this will not be the case. Machines cannot learn to develop STEM problems on their own. Instead of authoring problems, practitioners will be busy teaching machines.



Technology and challenge go hand in hand. These are the major challenges to automated STEM problem creation we face today.

Ingesting Content for Machine Training at Scale – Much STEM content exists, but it has been produced in human-readable rather than machine-sensible formats. Therefore, large-scale content ingestion may be time consuming and resource intensive.

Obtaining High-Quality Content – Content publishers seek to restrict use of their content, while public-domain content may be poorly curated or moderated and therefore difficult to trust. The adage “Garbage In, Garbage Out” holds: training machines with poor content will lead to poor outcomes.

Machine Training Time is Long and Costly – Training machines takes substantial expert time and significant elapsed time. As time is money, this challenge is one of intellectual and financial resource availability.

Creating a Pedagogical Taxonomy – Supervised training uses samples of various pedagogical content knowledge taxonomies. Over time, we will create a universal taxonomy that covers the learning objectives of the major textbook publishers.

Developing Appropriate Datasets – For each topic in the taxonomy, we need an associated dataset. This ensures that the STEM problems in the dataset match the parameters of the taxonomy topic.

Achieving a High Accuracy Rate – A high accuracy rate in STEM problem generation saves both expert training time and the time that experts would need to create problems manually. High error rates consume valuable resources and waste time.

These challenges and others have our focus and will ultimately yield to collegial partnering, ongoing development, and thorough testing.



AI approaches have been around for several decades, but we are at an inflection point for machine learning. Advances in hardware have produced truly high-performance computing nodes, which successful AI work requires. Open-source AI software is now publicly available. AI work has evolved into a community-driven initiative with significant sharing and innovation, moving beyond what a single individual or enterprise could create.

Taking all this together, the time for success in STEM problem development is at hand. Here are the major building blocks of our initiative.

Supervised Learning — Teaching machines is a type of machine learning (ML) known as supervised learning. Supervised learning uses repeated training by expert practitioners to improve the algorithms used to classify and develop output, which, in this case, is STEM problem sets.

Natural Language Processing — It takes Natural Language Processing (NLP) algorithms, another facet of AI and Machine Learning, to create meaningful text within the problems. NLP uses the terms and semantic concepts of STEM problems that reside in an ontology base we create. When building new problem sets, the appropriate ontology is automatically selected from those already in the ontology base, using exact matching or closest match.
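As a rough illustration of “exact matching or closest match” selection, the sketch below looks up an ontology by name and falls back to fuzzy string matching from Python's standard `difflib` module. The contents of `ONTOLOGY_BASE`, the function name, and the 0.6 similarity cutoff are assumptions made for this example; the actual ontology base and matching method are not described here and may well rely on semantic rather than string similarity.

```python
import difflib

# Hypothetical ontology base: ontology name -> a few of its terms.
ONTOLOGY_BASE = {
    "percentages": ["percent", "of", "base", "rate"],
    "slope": ["rise", "run", "horizontal distance", "descent"],
    "quadratic equations": ["factor", "root", "zero", "coefficient"],
}


def select_ontology(topic, cutoff=0.6):
    """Return the ontology name for `topic`: exact match first, then closest."""
    key = topic.strip().lower()
    if key in ONTOLOGY_BASE:
        return key
    # Fall back to the single most similar name at or above the cutoff.
    matches = difflib.get_close_matches(key, list(ONTOLOGY_BASE), n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(select_ontology("Slope"))
print(select_ontology("quadratic equation"))
```

The cutoff controls how far a topic name may drift before the lookup gives up and returns `None`, at which point a new ontology would need to be created.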

Open Source Tools – Our approach is based on off-the-shelf tools, including TensorFlow, scikit-learn, NumPy, SciPy, and NLTK. You may learn more about these tools here. Our software development processes integrate these tools. Python is used to work with the libraries, and machine learning algorithms are implemented in C++ for improved server-side performance. Learn more about these languages as used in machine learning here.



At Competentum, we believe we are the optimal partner for making Automated STEM Problem Development a production reality. We have historical expertise in content and software development for education and partnerships with leading Data Science teams at top universities. This combination will lead to Automated STEM Problem Development success for our content partners and us. We cover all the steps in the process: supervised machine training by our subject matter experts (SMEs), development of content-generation software, and curation of final content by our SMEs. Our ongoing development and testing and engaged content partners will speed automated STEM problem development into production. We welcome content partners to join us in our journey to automation. Today, we are in the early stages of algorithm training, and we expect to be ready for production in 12 to 18 months. To learn more about joining our program, contact us.
