Image Processing (March/April 2025)
The Dynamic World of Image Generation Models
- Images = pixels stored in matrices and arrays; words = text broken down into tokens
- Computer vision, image processing, and language models build on a common foundation: deep learning
- OpenAI's DALL-E sets the stage. The name combines “Dalí” (after the surrealist artist Salvador Dalí) and “WALL-E” (the Disney/Pixar robot character)
- Stable Diffusion, Midjourney, Gemini 2.5 (Google), OpenAI GPT
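To make the first bullet concrete, here is a minimal sketch of both representations. It assumes a tiny grayscale image and a naive word-level tokenizer. Production language models typically use subword tokenizers such as byte-pair encoding instead, and the names `image`, `tokenize`, and `vocab` are illustrative, not taken from any library.

```python
import re

# A 3x4 grayscale "image": each number is one pixel's brightness (0-255).
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
]
height, width = len(image), len(image[0])


def tokenize(text):
    """Naive tokenizer: lowercase, then split into words and punctuation marks."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


tokens = tokenize("A robot paints a melting clock.")

# Map each distinct token to an integer ID, in order of first appearance.
vocab = {}
for tok in tokens:
    vocab.setdefault(tok, len(vocab))
token_ids = [vocab[tok] for tok in tokens]

print(height, width)
print(tokens)
print(token_ids)
```

In both cases the model never sees a picture or a sentence as such, only arrays of numbers.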
How Image Generation Models Work
- Language model foundations (embeddings)
- Text caption/image pairs for training (matching of embeddings)
- GPT = generative pre-trained transformer
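The "matching of embeddings" idea can be sketched with cosine similarity, one common way to compare a caption embedding with an image embedding (contrastive models such as CLIP are trained along these lines). The vectors below are invented for illustration. In a real system they would come from trained text and image encoders, and the helper names `cosine_similarity` and `best_caption` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy 3-dimensional embeddings, made up for this example only.
caption_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.0],
    "a photo of a dog": [0.1, 0.9, 0.0],
    "a city skyline at night": [0.0, 0.1, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]  # stand-in for an image encoder's output


def best_caption(image_vec, captions):
    """Return the caption whose embedding is most similar to the image embedding."""
    return max(captions, key=lambda c: cosine_similarity(image_vec, captions[c]))


print(best_caption(image_embedding, caption_embeddings))
```

Training on caption/image pairs aims to make matching pairs score high on a measure like this and mismatched pairs score low.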
Prompts Make the Difference
- Use simple, plain language, be concise, be explicit
- Name the objects, the setting, the style
- Revise, revise, revise
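The advice above can be turned into a small template. This is only a sketch: the function `build_prompt` and its wording pattern are hypothetical, not a format required by any particular image model. It names the objects, the setting, and the style explicitly, and a second call shows one round of revision.

```python
def build_prompt(objects, setting, style):
    """Join explicitly named objects, a setting, and a style into one concise prompt."""
    subject = " and ".join(objects)
    return f"{subject}, {setting}, in the style of {style}"


# First draft: short but vague about the scene.
draft = build_prompt(["a red bicycle"], "on a street", "a watercolor painting")

# Revision: more explicit objects and setting, same style.
revised = build_prompt(
    ["a red bicycle", "a sleeping cat"],
    "on a rainy cobblestone street at dusk",
    "a watercolor painting",
)

print(draft)
print(revised)
```

Keeping the three parts as separate arguments makes it easy to revise one of them at a time and compare the resulting images.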
Learning by Doing, Learning by Programming
- Hugging Face repository of machine learning and AI models
- Python programming packages: PyPI
- R programming packages: CRAN
- Go programming packages: go.dev
- AI-assisted programming
References

- Elgendy, Mohamed. 2020. Deep Learning for Vision Systems. Shelter Island, NY: Manning. [ISBN-13: 978-1617296192]
- Foster, David. 2023. Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play (second edition). Sebastopol, CA: O’Reilly. [ISBN-13: 978-1098134181]
- Lane, Hobson and Maria Dyshel. 2025. Natural Language Processing in Action (second edition). [ISBN-13: 978-1617299445]
- Tunstall, Lewis, Leandro von Werra, and Thomas Wolf. 2022. Natural Language Processing with Transformers: Building Language Applications with Hugging Face (revised edition). Sebastopol, CA: O’Reilly. [ISBN-13: 978-1098136796]