Artificial intelligence (AI) is quickly becoming a game-changer for video producers. From automating mundane tasks to enhancing the creative process, AI is transforming the way we produce and consume video content.

In this article, we are going to discuss some key ways that AI will impact video producers. Without further ado, let's begin!

Object and Person Tracking

Object tracking involves following an object across frames in a video. It can be used to manipulate an existing video, for example by adding annotations, or to direct the position of a camera. There are several ways object tracking can be integrated into video production.

For example, an AI producer could use PTZ controls to create dynamic camera movements that follow a specific object or person within a shot. This can help to keep the viewer engaged and create a more immersive viewing experience.

One common use case is in sports broadcasting, where object tracking can be used to follow the movement of players on the field or court. This allows the camera to stay focused on the action, rather than having to constantly reframe the shot as the players move around.
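To make the idea concrete, here is a minimal sketch of the core of object tracking: associating each detection in a new frame with the nearest known object. This is an illustrative toy, not a production tracker; real systems use library trackers (e.g., OpenCV's CSRT) or learned detectors, and the function name and thresholds below are assumptions for the example.

```python
def track(known, detections, max_dist=50.0):
    """known: {object_id: (x, y)} last positions; detections: list of
    (x, y) centroids in the new frame. Returns updated positions."""
    updated = {}
    unclaimed = list(detections)
    for obj_id, (px, py) in known.items():
        if not unclaimed:
            break
        # Pick the detection closest to this object's last position.
        nearest = min(unclaimed, key=lambda d: (d[0] - px) ** 2 + (d[1] - py) ** 2)
        if (nearest[0] - px) ** 2 + (nearest[1] - py) ** 2 <= max_dist ** 2:
            updated[obj_id] = nearest
            unclaimed.remove(nearest)
    return updated

# The ball moved slightly between frames; the tracker keeps its identity.
objects = {"ball": (100, 100)}
objects = track(objects, [(104, 98), (400, 300)])
print(objects)  # {'ball': (104, 98)}
```

In a broadcast setting, the per-frame positions this produces are what downstream tools (camera control, annotation, highlighting) consume.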

Example: OBSBOT Tail

The OBSBOT Tail has a built-in three-axis pan/tilt/zoom gimbal that effectively stabilizes the image while shooting. The OBSBOT team focuses primarily on building solutions for content creators and streamers.

Footage provided via OBSBOT Tail product page

Example: PTZOptics Move

With the PTZOptics Move, you can select any individual in the frame. The camera will automatically track and stay focused on the person selected.

The camera tracks the subject automatically, keeping them in focus at all times. Even when other individuals are in the scene, as long as the subject remains in the field of view, the camera will keep them in focus and in the frame. This is especially helpful for educators who record lectures or for creators producing live events.

For example, consider an event where presentations are given on a stage. By using a tool like this, the camera could stay focused on the presenter as they walk, rather than anyone else (e.g., the person who introduced them, or a tech assistant who comes on stage to help with any issues).
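The control loop behind this kind of tracking can be sketched simply: convert the subject's offset from frame center into pan/tilt nudges. This is a hedged illustration of the common proportional-correction idea, not PTZOptics' actual control protocol; the deadzone and gain values are assumptions.

```python
def pt_correction(frame_w, frame_h, cx, cy, deadzone=0.1, gain=0.5):
    """Return (pan, tilt) nudges in [-1, 1]; 0.0 means "hold position".
    (cx, cy) is the tracked subject's centroid in pixels."""
    # Normalised offset of the subject from frame centre, in [-1, 1].
    dx = (cx - frame_w / 2) / (frame_w / 2)
    dy = (cy - frame_h / 2) / (frame_h / 2)
    # A deadzone avoids jittery micro-movements when the subject is
    # already roughly centred.
    pan = gain * dx if abs(dx) > deadzone else 0.0
    tilt = gain * dy if abs(dy) > deadzone else 0.0
    return pan, tilt

print(pt_correction(1920, 1080, 960, 540))   # (0.0, 0.0) - already centred
print(pt_correction(1920, 1080, 1440, 540))  # (0.25, 0.0) - pan right
```

Run every frame against the tracker's output, this keeps the presenter framed as they walk across the stage.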

Footage provided via PTZOptics YouTube

Content-Aware Zoom

Additionally, AI producers can use Pan Tilt Zoom (PTZ) controls to automate the framing of a shot, ensuring that the subject remains in focus and properly composed within the frame.

This is especially useful in situations where the camera operator is unable to control the camera manually, such as when the camera is mounted on a moving vehicle or attached to a drone (a common scenario in broadcasting, news recording, and action shots). Understanding what content is important to capture, and how to capture it, will bring AI production to a competitive level.
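When no operator is available, automated framing often comes down to choosing a crop window that keeps the subject composed without running off the frame. The sketch below shows that clamping logic under illustrative assumptions; real products apply richer composition rules.

```python
def reframe(frame_w, frame_h, subj_x, subj_y, crop_w, crop_h):
    """Return the (x, y) top-left corner of a crop_w x crop_h window
    centred on the subject, clamped to stay inside the full frame."""
    x = subj_x - crop_w // 2
    y = subj_y - crop_h // 2
    x = max(0, min(x, frame_w - crop_w))
    y = max(0, min(y, frame_h - crop_h))
    return x, y

# Subject near the right edge: the crop clamps instead of leaving the frame.
print(reframe(1920, 1080, 1900, 540, 960, 540))  # (960, 270)
```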

Example: RunwayML - Video Editing Software

There is research focused on building an AI that can automatically zoom in based on content rules, but the closest commercial use cases with digital zoom tools are products like Adobe Premiere Pro and RunwayML.

In these products, we can set timing rules for how fast the zoom should happen and specify what the zoom should focus on. In the near future there will likely be a tool that lets the user define rules for what types of content the zoom should target, allowing the AI to control when and how the zoom is used in your content.
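A timing rule of this kind is essentially an easing curve over the zoom factor. Here is a minimal sketch, assuming a smoothstep ease from one zoom factor to another over a fixed duration; the function name and defaults are illustrative, not any product's API.

```python
def zoom_at(t, duration, start=1.0, end=2.0):
    """Smoothstep-eased zoom factor at time t (seconds)."""
    p = min(max(t / duration, 0.0), 1.0)
    p = p * p * (3 - 2 * p)  # smoothstep: slow in, slow out
    return start + (end - start) * p

print(zoom_at(0.0, 4.0))  # 1.0
print(zoom_at(2.0, 4.0))  # 1.5  (midpoint of the ease)
print(zoom_at(4.0, 4.0))  # 2.0
```

An AI producer would pick `start`, `end`, and `duration` from its content rules, then feed each frame's factor to the crop/scale stage.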

Bonus: ChatGPT or NLP Integration

The AI producer might be able to use NLP to develop a content rule set. "I want you to make a video where we zoom in on 'Bruce Lee', then 'Chuck Norris' and then a 'Cat' using the clips in my content library."
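The first step of such a pipeline is turning the instruction into an ordered rule set. A real system would use an LLM for this; the toy sketch below just extracts the quoted subjects in order, which is enough to show the shape of the output the zoom controller would consume.

```python
import re

def zoom_targets(prompt):
    """Extract single-quoted subjects, in order, as zoom targets."""
    return re.findall(r"'([^']+)'", prompt)

prompt = ("I want you to make a video where we zoom in on 'Bruce Lee', "
          "then 'Chuck Norris' and then a 'Cat' using the clips in my "
          "content library.")
print(zoom_targets(prompt))  # ['Bruce Lee', 'Chuck Norris', 'Cat']
```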

Bruce Lee, Chuck Norris, and Cat

Automated Scene Priority

AI can assist with tasks such as color grading, object detection, and scene analysis to help video producers create more polished and professional-looking videos.

Example: Automated Scene Switching in OBS

Advanced Scene Switcher is a comprehensive plug-in with extensive settings that helps producers automate scene switching. It adds switching automation to OBS, a tool widely used for video production and streaming.

The installation process for Advanced Scene Switcher is the same as for most other OBS plug-ins; it can be downloaded from the official website as a single .zip file. When you unpack it, you will see the auto-installers and files for Mac, Windows, and Linux. Once installed, you can create personalized logic for controlling scenes in the OBS environment.

A tool that integrates AI with pre-built logic and multiple cameras can fully control scene selection without real-time human intervention.
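The heart of such automation is a priority-ordered rule set, the kind of logic Advanced Scene Switcher lets you configure. The sketch below is a hedged illustration of first-match-wins switching; the scene names, signals, and thresholds are made up for the example, not part of any plug-in's API.

```python
# Rules are checked in priority order; the first match wins.
RULES = [
    (lambda s: s["audio_level"] > 0.6, "Speaker Close-up"),
    (lambda s: s["motion"] > 0.5,      "Wide Stage"),
    (lambda s: True,                   "Default"),        # fallback
]

def pick_scene(state):
    """Return the scene name for the current sensed state."""
    for condition, scene in RULES:
        if condition(state):
            return scene

print(pick_scene({"audio_level": 0.8, "motion": 0.1}))  # Speaker Close-up
print(pick_scene({"audio_level": 0.2, "motion": 0.7}))  # Wide Stage
print(pick_scene({"audio_level": 0.1, "motion": 0.1}))  # Default
```

In practice the chosen scene name would be sent to OBS over its websocket interface each time the decision changes.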

Jump cut / scene transition examples

Interactive and Customized Content

AI can be used to create personalized video experiences for viewers by analyzing preferences and interests based on the contents of videos. AI is already being used to create personalized video playlists or recommend related videos based on a viewer's history.[1]

We can take this concept a step further by allowing the AI to decide what live content you would find interesting during each stream. For example, if many of the people attending a stream are interested in "Let's Play" videos, the streamer could add more narrative to fit the audience's preferences.

This could also enhance the multi-streaming experience by allowing the AI producer to curate scene inputs for each individual viewer.
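A simple way to ground this: aggregate viewer interest tags and surface the ones a majority shares, so the producer (human or AI) knows what to lean into. The tags, threshold, and function name below are illustrative assumptions, not a real recommendation system.

```python
from collections import Counter

def audience_focus(viewer_tags, threshold=0.5):
    """viewer_tags: list of per-viewer interest lists.
    Returns the tags at least `threshold` of viewers share, sorted."""
    counts = Counter(tag for tags in viewer_tags for tag in set(tags))
    n = len(viewer_tags)
    return sorted(t for t, c in counts.items() if c / n >= threshold)

viewers = [["lets-play", "speedrun"], ["lets-play"], ["lets-play", "art"]]
print(audience_focus(viewers))  # ['lets-play']
```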

Example: The Jerma985 Dollhouse

Arguably one of the most impactful and creative live streaming events of 2022 was Jerma's Dollhouse. While no AI was involved in this production, the concept requires a live audience that is engaged in the content production process.

Based on live user input, the scene or the actors' personalities change. While Twitch Plays Pokémon explored the idea of user-led content creation, Jerma modernized the idea by engaging other streamers and producers who can improvise in the moment based on user feedback.

"The event took place between three separate livestreams, broadcast on Jerma's Twitch page.[2] The event was modeled after life simulation game franchise The Sims.[2][3] In the event, the stream viewers were given control over what Jerma does, through the ability to make decisions using a stream extension." - Wikipedia

This real-time feedback cycle between viewers and actors can and will be automated in the future. While I don't think producers will ever be fully replaced, I do envision an AI being able to control scenes, operate cameras, and provide metrics far faster than a human ever could, allowing content creators and producers to focus on building better content rather than on the operational workload.

The Jerma985 Dollhouse via The Dollhouse Stream Day 1

Real-time Scene Augmentation

Information from an object tracking tool can be used to add visual effects or animations that follow a specific object or person within a shot. This can be used to enhance the visual impact of a video, or to create more engaging and immersive viewing experiences.

For example, consider a scenario where you want to see who has possession of a ball in a game of football. An object tracking tool could keep track of the ball and highlight it with a specific color. The player closest to the ball could have a circle drawn under them to indicate that they have possession of the ball. This would be particularly useful in replays to make it clear to viewers what is happening in a scene.
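The possession logic described above can be sketched directly from tracker output: find the player nearest the ball and mark them, so a renderer can draw the highlight circle. This is a minimal illustration; the distance threshold and names are assumptions for the example.

```python
import math

def possession(ball, players, max_dist=80.0):
    """ball: (x, y); players: {name: (x, y)} from the tracker.
    Returns the nearest player's name, or None if nobody is close."""
    name, pos = min(players.items(),
                    key=lambda kv: math.dist(ball, kv[1]))
    return name if math.dist(ball, pos) <= max_dist else None

players = {"A": (110, 200), "B": (400, 420)}
print(possession((120, 210), players))  # A
print(possession((600, 50), players))   # None (ball is loose)
```

The same per-frame decision drives the replay overlay: highlight the ball, and circle whichever player currently holds possession.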

This can be expanded into physical environments where the rule sets for augmentation can be built depending on real-time variables such as lighting, room shape, and camera resources. Through proper planning, an AI could track an actor and produce high quality visual effects that are mapped to the actor and environment in real-time.

Example: Notch - VMAs 2022 Augmented Reality

Trevor Burk of Visual Noise Creative and VMAs Supervising Producer and Creative Director Paul Caslin approached Hidden Road Studios earlier this year to design work for the 2022 VMAs, including broadcast design elements, house screen material, and, most importantly, the main show's augmented reality.

“We applied our VR video to the environment map within Notch, and our textures displayed real-time reflections of the actual arena and its lighting conditions. From there, we expanded on that process and used it for lighting objects. It was a huge step forward for us that could only have been achieved using Notch.” - Rowan Glenn, Creative Producer & Lead Designer, Hidden Road Studios
VMAs 2022 Album of the Year via Notch


Artificial intelligence (AI) is quickly becoming a game-changer for video production, with the potential to revolutionize the way we create and consume video content.

AI can help video producers automate mundane tasks, improve video editing and post-production, enhance viewer engagement, and increase efficiency and productivity. It can also be used to create personalized video experiences for viewers, using algorithms to analyze their preferences and interests.

AI can assist with tasks such as color grading and scene analysis to help people create more professional videos without as much manual work.

Overall, the integration of AI into video production is poised to have a significant impact on the content creation process, enabling video producers to create more sophisticated and engaging video content in less time.

To try out connecting AI to video production, see this tutorial on controlling OBS Studio with hand gestures.