
Artificial Intelligence is a fast-moving technology. In just a year and a half, it has grown from ChatGPT and OpenAI into something pervasive, introduced into nearly every software product.
Here is an Ignorant Guide:
1. AI Concepts
a. Machine-Learning Algorithms
Machine learning algorithms are software code snippets that explore, analyze, and find meaning in complex datasets. A machine can follow the algorithm to achieve a specific goal, such as translating text to another language.
As you are programming with AI, you want to choose an algorithm family that best suits your task, and then evaluate the various algorithms within the family to find the appropriate fit for your software. Microsoft provides a learning page on Machine-Learning Algorithms.
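To make "evaluate the various algorithms" a little more concrete, here is a minimal sketch in Python that scores two very simple algorithms on data they have not seen. The dataset, the function names, and the choice of accuracy as the yardstick are all assumptions made up for illustration, not anything Microsoft prescribes.

```python
from collections import Counter

# Toy labeled data, invented for illustration: (hours_studied, passed_exam).
train = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1)]
held_out = [(2.5, 0), (3.5, 0), (6.5, 1), (9, 1)]


def majority_class(train_rows):
    """Baseline algorithm: always predict the most common training label."""
    label = Counter(y for _, y in train_rows).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor(train_rows):
    """1-nearest-neighbor: copy the label of the closest training point."""
    def predict(x):
        _, label = min(train_rows, key=lambda row: abs(row[0] - x))
        return label
    return predict


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the known label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


# Score every candidate on the same held-out rows, then keep the best one.
scores = {
    "majority_class": accuracy(majority_class(train), held_out),
    "nearest_neighbor": accuracy(nearest_neighbor(train), held_out),
}
best = max(scores, key=scores.get)
print(scores, best)
```

On real projects you would compare candidates on far more data and with metrics that match your business goal, but the shape of the comparison stays the same.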
b. Machine Learning
Machine learning is an AI technique that uses algorithms to create predictive models. Machine-learning algorithms parse data fields and “learn” from the patterns within the data to generate models. The models can then make informed predictions or decisions based on new data.
The predictive models are validated against known data, and measured by performance metrics for specific business scenarios — then adjusted as necessary.
This process — of learning and validation — is called training. Through regular retraining, machine learning models improve over time.
In your software design, you might use machine learning if your scenario includes past observations that you can reliably use to predict future situations.
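As a rough sketch of the train, validate, and retrain cycle described in this section, the Python below fits a straight line to past observations, checks it against observations it did not see, and then retrains on everything. The numbers and the names (`fit_line`, `mean_absolute_error`) are invented for illustration; a real project would normally use a library rather than hand-written math.

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, points):
    """Performance metric: average size of the prediction error."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


# Past observations, invented for illustration: (ad_spend, sales).
history = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.1), (6, 12.9)]
training, validation = history[:4], history[4:]

model = fit_line(training)                       # learn from known data
error = mean_absolute_error(model, validation)   # validate on unseen data
retrained = fit_line(history)                    # retrain once more data is trusted

# Use the retrained model to predict a future situation (ad_spend = 7).
prediction = retrained[0] * 7 + retrained[1]
print(model, error, prediction)
```

If the validation error is too high for your scenario, that is the signal to adjust the model or gather better data before relying on its predictions.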
c. Deep Learning
Deep learning is a type of machine learning that learns its own internal representations of the data instead of relying on hand-crafted features. It also uses algorithms to analyze data, but it does so through artificial neural networks that contain many inputs, outputs, and layers of processing. Each layer processes the data in a different way, and the output of one layer becomes the input for the next. In so doing, deep learning creates more complex models than traditional machine learning.
Deep learning requires a large investment to generate highly customized, exploratory models.
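The idea that "the output of one layer becomes the input for the next" can be sketched in a few lines of Python. This is a toy forward pass only: the weights are hand-picked placeholders (a real network learns them during training), and the layer sizes and the `tanh` nonlinearity are assumptions chosen for illustration.

```python
import math


def dense_layer(inputs, weights, biases):
    """One layer: weighted sum of the inputs plus a bias, then a nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical hand-picked weights; each entry is (weights, biases) for a layer.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # 2 inputs -> 3 units
    ([[0.7, -0.5, 0.2]], [0.05]),                                # 3 units -> 1 output
]


def forward(inputs):
    """Feed the data through every layer in order."""
    activations = inputs
    for weights, biases in layers:
        # The output of this layer becomes the input of the next one.
        activations = dense_layer(activations, weights, biases)
    return activations


output = forward([1.0, 2.0])
print(output)
```

Production networks stack many more layers with millions of weights, which is where the large investment in data and compute comes from.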
d. Generative AI
Generative AI trains models to generate original content. This is the AI most people have become familiar with already — asking an AI engine to create a picture or a video or text, based on written prompts that you give it, sometimes accompanied by pictures or links to websites.
Some generative AI applications that Microsoft provides are:
- Copilot: integrated into a wide range of Microsoft applications. Helps you write code, documents, and other text-based content.
- Azure OpenAI: development platform as a service that provides access to OpenAI’s powerful language models, such as o1-preview, o1-mini, GPT-4o, GPT-4o mini, GPT-4 Turbo with Vision, GPT-4, GPT-3.5-Turbo, and the Embeddings model series. You can adapt these models to do specific tasks:
- Content generation
- Content summarization
- Image understanding
- Semantic search
- Natural language to code translation
e. Language Models
Language models are a subset of generative AI that focus on natural language processing tasks. Language models represent natural language based on the probability of words or sequences of words occurring in a given context.
- Conventional language models are used in supervised settings for research purposes where the models are trained on well-labeled text datasets for specific tasks.
- Pretrained language models are trained on large-scale text collections from the internet via deep learning neural networks. They enable you to easily get started with AI, and have become more widely used in recent years. You can fine-tune these models on smaller datasets for specific tasks.
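To see what "the probability of words occurring in a given context" means in practice, here is a deliberately tiny sketch: a bigram model that counts which word follows which. The corpus is invented, and real language models use neural networks over far longer contexts, so treat this as an illustration of the idea rather than how modern models are built.

```python
from collections import Counter, defaultdict

# A tiny invented corpus, already split into words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# For every word, count the words that were seen immediately after it.
follow_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    follow_counts[current][following] += 1


def next_word_probabilities(word):
    """Turn the raw counts for `word` into probabilities that sum to 1."""
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


probs = next_word_probabilities("the")
print(probs)
```

In this corpus "the" is followed by "cat" twice and by "mat" and "fish" once each, so the model considers "cat" the most likely next word.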
Parameters
The size of a language model is determined by the number of parameters (or weights) it has. Parameters govern how the model processes input data and generates output. The model adjusts the weights during training to minimize the difference between its predictions and the actual data. In so doing, the model learns its parameters. The more parameters a model has, the more complex and expressive it is, and the more computationally expensive it is to train and use.
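Here is a minimal sketch of what "adjusting the weights to minimize the difference" looks like, using a model with just two parameters and plain gradient descent. The data, learning rate, and step count are assumptions picked for illustration, and `count_parameters` is a hypothetical helper that shows how quickly the count grows for fully connected layers.

```python
# Training data generated from y = 2x + 1, invented for illustration.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

weight, bias = 0.0, 0.0   # the model's two parameters, before training
learning_rate = 0.05


def loss(w, b):
    """Mean squared difference between predictions and the actual data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


initial_loss = loss(weight, bias)
for _ in range(2000):
    # Gradients say which direction makes the loss smaller for each parameter.
    grad_w = sum(2 * (weight * x + bias - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (weight * x + bias - y) for x, y in data) / len(data)
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b
final_loss = loss(weight, bias)


def count_parameters(layer_sizes):
    """Weights plus biases for fully connected layers of the given sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


print(weight, bias, final_loss)
print(count_parameters([2, 3, 1]))
```

The same loop, scaled up to billions of parameters and far more data, is why training large models is so expensive.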
Tradeoffs
LLMs are smart, but they only work with what they were trained on. An LLM’s knowledge is fixed after training, and when it lacks relevant information, it may generate plausible-sounding but incorrect answers (“hallucinations”).
f. Copilots
Copilots are generative AI assistants that integrate into software applications, usually as chat interfaces — providing contextualized support for common tasks in those applications.
For example, Microsoft Copilot integrates with a wide range of Microsoft applications. It is based on an open architecture where non-Microsoft developers can create their own plug-ins to extend the user experience and even create their own copilots by using the same open architecture.
g. Retrieval Augmented Generation
Retrieval Augmented Generation (RAG) is an architecture pattern that augments the capabilities of a large language model (LLM), such as the models behind ChatGPT, which are trained only on public data. With RAG, you add a retrieval system that searches a specific set of information sources and supplies relevant grounding data as context alongside the user request.
RAG architecture helps you scope generative AI to content sourced from vectorized documents, images, and other data formats. RAG isn’t limited to vector search storage — you can use any data store technology.
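The retrieval half of RAG can be sketched without any AI service at all. The Python below ranks a few documents against a question using simple word-count vectors and cosine similarity, then builds a grounded prompt. The documents, the prompt wording, and the function names are invented for illustration, the in-memory list stands in for whatever data store you choose, and the final call to a language model is left out on purpose.

```python
import math
import re
from collections import Counter

# Stand-in for your own information sources; any data store could hold these.
documents = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "The support desk is open Monday through Friday.",
    "Shipping to Europe takes about one week.",
]


def vectorize(text):
    """Very rough 'embedding': a count of each lowercase word."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = vectorize(question)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question):
    """Combine the retrieved grounding data with the user request."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Are refunds accepted without a receipt?")
print(prompt)
```

A real system would swap the word counts for learned embeddings and send `prompt` to an LLM, but the flow (retrieve, then generate with context) is the same.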
Tradeoffs
RAG is smart and can “look things up” in real time. RAG retrieves specific, relevant information from an external source and combines it with the LLM’s reasoning to give more accurate, up-to-date answers. Still, if retrieval is poor or ignored, hallucinations can happen.
2. Agent-Based Architecture
An AI agent is a software system that uses artificial intelligence to perceive its environment, plan actions, and achieve goals, often with a degree of autonomy.
These agents can be as simple as rule-based systems or as complex as learning agents that adapt their behavior over time. They can be used for a wide range of tasks, from automating customer service interactions to designing websites.
Latest and Greatest
Agents are one of the newest techniques in AI. They are smart and can “decide what to do next” to reach a goal. An agent reasons through problems, chooses and uses the right tools (such as web search or APIs), and can optionally maintain memory of its steps. This enables an agent to handle complex tasks that require multiple actions.
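The "decide what to do next" loop can be sketched as follows. Everything here is a stand-in: the two tools are toy functions, the facts are invented, and `decide` is a hard-coded policy where a real agent would ask a language model to choose the next action.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def calculator(expression):
    """Tool: evaluate a simple 'a op b' arithmetic expression."""
    left, op, right = expression.split()
    return OPS[op](float(left), float(right))


def lookup(term):
    """Tool: stand-in for a web search or API call (facts are invented)."""
    facts = {"widget price": "4", "widgets ordered": "25"}
    return facts[term]


TOOLS = {"calculator": calculator, "lookup": lookup}


def decide(goal, memory):
    """Hypothetical policy; it ignores `goal`, which a real agent would reason about."""
    if "widget price" not in memory:
        return ("lookup", "widget price")
    if "widgets ordered" not in memory:
        return ("lookup", "widgets ordered")
    if "total" not in memory:
        return ("calculator",
                f"{memory['widget price']} * {memory['widgets ordered']}")
    return None  # nothing left to do: the goal is reached


def run_agent(goal, max_steps=10):
    """Loop: pick an action, run the tool, remember the result, repeat."""
    memory = {}
    for _ in range(max_steps):
        action = decide(goal, memory)
        if action is None:
            break
        tool, argument = action
        result = TOOLS[tool](argument)
        memory[argument if tool == "lookup" else "total"] = result
    return memory


memory = run_agent("Work out the total cost of the widget order")
print(memory)
```

The `max_steps` cap is a design choice worth keeping in real agents too, since it stops a confused agent from looping forever.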
Multiple Agents
As architects and developers design their software to take full advantage of language model capabilities, AI agent systems become increasingly complex — often exceeding the abilities of a single agent that has access to many tools and knowledge sources. Instead, these systems are designed to use multi-agent orchestrations to handle complex, collaborative tasks reliably.
Examples
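As one illustration of the multi-agent orchestration described above, here is a minimal sketch of a sequential hand-off between three specialist agents. The agents are plain functions that return canned strings, and their names and the approval rule are invented; in a real system each one would wrap its own model, tools, and knowledge sources.

```python
def research_agent(task):
    """Specialist 1: gather material on the task (stubbed out here)."""
    return f"notes on {task}"


def writer_agent(notes):
    """Specialist 2: turn the research notes into a draft."""
    return f"draft based on {notes}"


def reviewer_agent(draft):
    """Specialist 3: approve the draft only if it is grounded in research notes."""
    return {"approved": "notes on" in draft, "draft": draft}


def orchestrate(task):
    """Sequential hand-off: each agent's output becomes the next agent's input."""
    notes = research_agent(task)
    draft = writer_agent(notes)
    return reviewer_agent(draft)


result = orchestrate("solar panels")
print(result)
```

Other orchestration styles exist, such as running agents in parallel or letting a coordinator pick the next agent dynamically, but the sequential pipeline is the easiest one to reason about.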
3. AI Services
Microsoft provides Azure AI services, which enable developers to use ready-made, prebuilt, and customizable APIs and models to create market-ready applications. Use cases include natural language processing for conversations, translation, speech, vision, search, monitoring, and decision-making.
4. AI Language Models
There are different types of AI Language models:
Large Language Models (LLMs): LLMs, such as the OpenAI GPT models, can generate natural language across various tasks. To choose a model, consider factors such as bias, accuracy, data privacy, and ethical use.
Small Language Models (SLMs): Small, less compute-intensive language models are available for generative AI solutions. Microsoft, for example, offers the Phi-3 series of Small Language Models. SLMs can be more efficient, interpretable, and explainable than LLMs.
When developers build AI into their application, they can use language models as a hosted solution behind a metered API. Alternatively, they can run a small language model in-process, or at least on the same compute as the consumer.
NEXT
All of the above is theory.
The next step toward understanding how to build AI products is getting to know the actual development tools and platforms that you can use. We’ll cover that in another article, which includes:
- Development Tools & Platforms
- Data Platforms for AI
- Data Storage for AI
- Data Processing for AI
- Data Connectors for AI
To read all about it — click here.