Creative Coding

Smell-O-Vision

Cover Image for Smell-O-Vision

Background

This project was a collaborative effort between M Dougherty and me. I was incredibly excited to participate in their vision: to create an interactive, sensory experience with visuals and scents.

Planning

After some planning and feasibility experiments, we decided to aim for an interactive environment that used camera-based position detection to find the number of individuals in a space, the center point between them, and some other metrics discussed here. These data points would power a visualization built with the p5.js library and trigger scent diffusers around the room.

My primary role was to develop the machine learning integration, linking it with a live webcam and tuning it for performance in low light. M tackled hacking the diffusers so that they could all be triggered from a single Arduino. We collaborated on the artistic direction for the visuals.

Execution

Visuals

My previous post discussed the basic setup of the Posenet machine learning model. As it turns out, the most useful data points for our v1 were the number of individuals (poses) and their average X and Y coordinates in the room. The white circles in this photo represent individual poses; the purple circle is the center point for the group.

Top down view of individuals in a lobby. A purple blob is in the center of the photo and smaller white circles are on each individual.
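The overlay itself boils down to very little p5.js code. Here is a minimal sketch of the idea, assuming a poses array of {x, y} center points is kept up to date elsewhere by the Posenet callback (the sizes and colors are illustrative):

```js
// Assumes `poses` is an array of { x, y } center points kept up to date
// by the Posenet callback elsewhere in the sketch.
function draw() {
  background(0);

  let sumX = 0;
  let sumY = 0;

  // A white circle for each detected individual.
  noStroke();
  fill(255);
  for (const pose of poses) {
    circle(pose.x, pose.y, 20);
    sumX += pose.x;
    sumY += pose.y;
  }

  // A purple circle at the group's average position.
  if (poses.length > 0) {
    fill(170, 60, 255);
    circle(sumX / poses.length, sumY / poses.length, 40);
  }
}
```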

Feeling pretty good about the model’s ability to handle stock footage, my next mission was to experiment with possible visualizations that could use this data. Our initial attempts used the HSB color mode and produced a pretty brutal color palette no matter the saturation. The results of my experimentation were maybe interesting, but I don’t think they were much better.

In the end, we knew that we wanted four different color scenes, one in each corner, and that they should be at least vaguely red, blue, green, and yellow. After a few days of experimenting, our solution was pretty simple: whenever the group’s central point crossed into a new zone, we drew the blobs in standard RGB color mode using only the channel that corresponds to that zone.
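A rough sketch of that zone logic, assuming hypothetical currentZone and blobColor helpers and p5.js globals for the canvas size (yellow is the one zone that mixes two channels):

```js
// Hypothetical zone lookup: which quadrant of the canvas is the group in?
function currentZone(groupX, groupY) {
  const left = groupX < width / 2;
  const top = groupY < height / 2;
  if (top && left) return 'red';
  if (top && !left) return 'green';
  if (!top && left) return 'blue';
  return 'yellow';
}

// Per-zone RGB channel weights; yellow is the one zone that mixes two channels.
const ZONE_CHANNELS = {
  red: [1, 0, 0],
  green: [0, 1, 0],
  blue: [0, 0, 1],
  yellow: [1, 1, 0],
};

// Blobs in a given zone are drawn using only that zone's channel(s).
function blobColor(zone, brightness) {
  colorMode(RGB, 255);
  const [r, g, b] = ZONE_CHANNELS[zone];
  return color(r * brightness, g * brightness, b * brightness);
}
```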

User testing with my lovely wife Raven

Performance

After settling on a visual aesthetic, the performance of our visualization (as measured by frame rate) started to chug and lag. I found that the most impactful improvement was significantly shrinking the p5 canvas, then scaling it back up in the browser with CSS. This initially created some weird misalignments between the generated Posenet positions and the newly distorted canvas, but once I correlated the video, canvas, and Posenet input resolutions, it effectively solved all of our animation lag problems.
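The gist of the fix, as a sketch (the internal resolution and CSS sizing here are illustrative, not our exact production values):

```js
// Render at a small internal resolution, then let CSS stretch the canvas.
const RENDER_W = 320;
const RENDER_H = 240;

function setup() {
  const cnv = createCanvas(RENDER_W, RENDER_H);
  cnv.style('width', '100vw');  // CSS upscaling is cheap;
  cnv.style('height', '100vh'); // drawing extra pixels is not.
}

// Posenet runs against the raw video resolution, so its coordinates
// have to be mapped into the smaller canvas before drawing.
function toCanvasCoords(point, video) {
  return {
    x: map(point.x, 0, video.width, 0, width),
    y: map(point.y, 0, video.height, 0, height),
  };
}
```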

Installation Issues

At this point, we felt comfortable beginning to move our work into our installation space (a conference room at the NYU ITP campus). When using a proper webcam instead of my MacBook’s, we ran into quite a bit of trouble finding a place for the camera due to a change in aspect ratio. We wanted the lights to be as dim as possible to enhance the visibility of the projection, but the lower light caused Posenet to struggle. In addition, we had a hard time finding a position for the camera that could see the room in its entirety. As a result, we couldn’t get any blobs to appear.

Installation Solutions

After another series of video and canvas size tweaks, we got some poses, but they would disappear erratically due to the lighting. After some experimentation, our current solution is two-fold: adjusting our code to permit poorer-quality poses and using post-processing software to increase the exposure. It’s a super fine line between under- and over-detection (more blobs than people), but I’m pretty happy with the balance we’ve tuned it to at the moment.
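On the code side, that mostly comes down to the thresholds passed to Posenet’s multi-pose estimation; the values below are illustrative rather than our final tuning, and net is assumed to be the already-loaded model:

```js
// Lower thresholds accept lower-confidence (dimly lit) poses, at the cost of
// occasionally detecting blobs that aren't people. `net` is the loaded model.
async function detectInDimLight(video) {
  const poses = await net.estimateMultiplePoses(video, {
    flipHorizontal: false,
    maxDetections: 10,
    scoreThreshold: 0.15, // more permissive than the 0.5 default
    nmsRadius: 20,
  });
  // A second, per-pose filter lets overall confidence be tuned separately.
  return poses.filter((pose) => pose.score > 0.1);
}
```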

Finally, for the camera positioning, we simply couldn’t find a spot that covered all four quadrants of the room; I wished out loud for one of those iPhone fisheye lenses so we could tape it to the webcam. A few minutes later – in a moment of pure magic – M came into the room brandishing a tragically unused gift from their mom: an iPhone fisheye lens. To the great delight of M’s mother, it worked like a charm.

Me setting up the fisheye lens in the corner of the room

Takeaway

This project was a tremendous opportunity for me to experiment with a variety of topics, from machine learning to animation, performance, and visualization. Historically, I have a difficult time delegating and trusting partners. This project, with its size, scope, and deadline, would have been an impossible task alone. However, with our shared creative vision and work ethic, M and I were able to create something really special. Getting to experience the gestalt of a truly collaborative, creative endeavor has been an incredibly rewarding lesson I’ll carry with me.

Creative Coding

Three.js – Week Two

Cover Image for Three.js – Week Two

Overview

This week I wanted to learn more about camera movement and effects in three.js. I also wanted to experiment with bringing outside data into the animation. M and I are working on a project to visualize group dynamics in a small room, so I started by laying data on top of a video feed using regular canvas elements. The “levers” we were looking for in order to control the visualization were person count, average group X and Y coordinates, and a proximity or unity factor describing how close the audience members were in relation to one another.

After I was able to successfully extract these parameters from a video feed, I started experimenting with what an output could look like using three.js.

Input

Sourcing the data from the video feed happened through a series of measurements that build on one another. Throughout the process, I drew directly onto a canvas with regular canvas drawing tools to ensure I was describing the data correctly.

Posenet

The first representation of the group comes from the Posenet model, running on tensorflow.js. This required a fair amount of tweaking due to the overhead view.

Overhead view of people walking in a public space with data drawn on top
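For context, a minimal version of that setup looks roughly like this; the option values are illustrative, and the real sketch tunes them for the overhead angle:

```js
import * as posenet from '@tensorflow-models/posenet';

let net;

// Load the model once up front...
async function init() {
  net = await posenet.load();
}

// ...then estimate poses against the overhead video on each frame.
// Each pose comes back as { score, keypoints: [{ part, position: { x, y }, score }] }.
async function detect(video) {
  return net.estimateMultiplePoses(video, {
    flipHorizontal: false,
    maxDetections: 10,
    scoreThreshold: 0.3,
    nmsRadius: 20,
  });
}
```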

Center points

The next representation came from calculating the central point of each pose’s bounding box. We also get a solid count of the number of individuals in the feed at this stage.

Overhead view of people walking in a public space with data drawn on top
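A sketch of that reduction, deriving each bounding box from the pose’s keypoints:

```js
// Reduce a pose to the center of the bounding box around its keypoints.
function poseCenter(pose) {
  const xs = pose.keypoints.map((k) => k.position.x);
  const ys = pose.keypoints.map((k) => k.position.y);
  return {
    x: (Math.min(...xs) + Math.max(...xs)) / 2,
    y: (Math.min(...ys) + Math.max(...ys)) / 2,
  };
}

const centers = poses.map(poseCenter);
const personCount = centers.length; // the person-count "lever"
```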

Group average

Finding the group average would now allow us to approximate the average group X and group Y coordinates. At this point, it was apparent that data smoothing would be required to deal with individuals entering and exiting the scene, as well as general uncertainty in the model, both of which were negatively impacting the animation.

Overhead view of people walking in a public space with data drawn on top
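One simple way to handle both the averaging and the smoothing is an exponential moving average; the smoothing factor below is purely illustrative:

```js
// Smoothed group position, updated once per frame.
let groupX = 0;
let groupY = 0;
const SMOOTHING = 0.9; // closer to 1 = slower, steadier movement

function updateGroupAverage(centers) {
  if (centers.length === 0) return;
  const avgX = centers.reduce((sum, c) => sum + c.x, 0) / centers.length;
  const avgY = centers.reduce((sum, c) => sum + c.y, 0) / centers.length;
  // Blend the new average into the previous value to damp pose jitter
  // and the jumps caused by people entering or leaving the frame.
  groupX = SMOOTHING * groupX + (1 - SMOOTHING) * avgX;
  groupY = SMOOTHING * groupY + (1 - SMOOTHING) * avgY;
}
```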

Final

For the final parameter, unity, a bounding box was drawn around the group’s individual X and Y coordinates. Calculating the area of this bounding box allows us to approximate how close the individuals are to one another at any given moment.

For the unity dimension, we wanted to ensure that past a certain point of closeness (presumably six feet) there’s no additional effect, so the piece never encourages individuals to stand closer than necessary.
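A sketch of that idea, with entirely illustrative area thresholds standing in for the six-foot cutoff:

```js
// Unity: a smaller group bounding box means a more unified group, clamped so
// that squeezing in closer than the minimum distance gives no extra effect.
const MIN_AREA = 20000;  // illustrative stand-in for "about six feet apart"
const MAX_AREA = 300000; // illustrative stand-in for the whole room

function unityFactor(centers) {
  if (centers.length < 2) return 0;
  const xs = centers.map((c) => c.x);
  const ys = centers.map((c) => c.y);
  const area =
    (Math.max(...xs) - Math.min(...xs)) * (Math.max(...ys) - Math.min(...ys));
  const clamped = Math.min(Math.max(area, MIN_AREA), MAX_AREA);
  // Map area onto 0..1, inverted so that closer groups score higher.
  return 1 - (clamped - MIN_AREA) / (MAX_AREA - MIN_AREA);
}
```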

Output

To experiment with the different parameters, I used mouse position to approximate group X and unity (measured on the Y axis). Clicking the mouse has the same effect as adding an additional person.
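The stand-in wiring is just a few event listeners; the globals below are hypothetical names for the values the three.js scene reads each frame:

```js
// Hypothetical globals read by the three.js render loop.
let groupX = 0.5;     // 0..1 across the room
let unity = 0.5;      // 0..1, driven by mouse Y for now
let personCount = 1;

window.addEventListener('mousemove', (e) => {
  groupX = e.clientX / window.innerWidth;
  unity = 1 - e.clientY / window.innerHeight; // higher on screen = more unified
});

window.addEventListener('click', () => {
  personCount += 1; // each click pretends another person walked in
});
```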

Creative Coding

Three.js – Repetition

Cover Image for Three.js – Repetition

Assignment

Make a sketch with a lot of repetition, more than you want to hand-code. Can you make your sketch responsive to mouse behavior or key presses?

Result

Ideas & Process

I wanted to use this week’s project to experiment with react-three-fiber, a library that lets you build three.js projects declaratively with reusable, self-contained components. Their documentation got me started with a pair of slowly spinning cubes. With a couple of loops, I was able to create a screen of cubes, which could then be used to render text.

The word “hi” spelled out in cubes
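A pared-down sketch of that cube screen in react-three-fiber (the Cube component, grid size, and colors are illustrative):

```jsx
import React from 'react';
import { Canvas } from 'react-three-fiber';

// One small cube; its position comes from its spot in the grid.
function Cube({ position }) {
  return (
    <mesh position={position}>
      <boxBufferGeometry attach="geometry" args={[0.8, 0.8, 0.8]} />
      <meshStandardMaterial attach="material" color="hotpink" />
    </mesh>
  );
}

// A screen of cubes: two loops, one reusable component.
function CubeScreen({ cols = 20, rows = 12 }) {
  const cubes = [];
  for (let x = 0; x < cols; x++) {
    for (let y = 0; y < rows; y++) {
      cubes.push(
        <Cube key={`${x}-${y}`} position={[x - cols / 2, y - rows / 2, 0]} />
      );
    }
  }
  return <group>{cubes}</group>;
}

export default function App() {
  return (
    <Canvas camera={{ position: [0, 0, 20] }}>
      <ambientLight intensity={0.4} />
      <pointLight position={[10, 10, 10]} />
      <CubeScreen />
    </Canvas>
  );
}
```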

My lovely wife, Raven, had the idea to build something embodying the quote oft misattributed to Einstein: "Insanity is doing the same thing over and over again and expecting different results."

This gave me the idea to create a game where the user’s actions often did nothing... until they suddenly did. I really liked the parallel between the user’s repeated actions and the repetition of the cubes.

Hurdles

Once the text was revealed, I wanted to interpolate between the two states so the change from small to large cubes wasn’t so sudden. One of my favorite animation libraries, react-spring, has built-in support for three.js. After reacclimating to the library, I was able to get some really neat results.
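The pattern looks roughly like this; the component, prop names, and spring values are illustrative rather than the project’s actual code:

```jsx
import React from 'react';
import { useSpring, animated } from 'react-spring/three';

// Springs a cube between its hidden and revealed sizes instead of snapping;
// `revealed` would flip once the player finally triggers the change.
function AnimatedCube({ revealed, position }) {
  const { scale } = useSpring({
    scale: revealed ? [1, 1, 1] : [0.2, 0.2, 0.2],
  });
  return (
    <animated.mesh position={position} scale={scale}>
      <boxBufferGeometry attach="geometry" args={[0.8, 0.8, 0.8]} />
      <meshStandardMaterial attach="material" color="hotpink" />
    </animated.mesh>
  );
}
```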

I really wanted the point light to follow the user’s mouse but kept running into performance issues. I quickly learned that you never want to tie fast-moving animations to React state in three.js; you end up re-rendering far more than you need or intend, causing everything to slow to a crawl. This is discussed in depth in the react-three-fiber performance pitfalls documentation.
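The fix suggested there is to mutate objects directly inside the render loop with useFrame instead of routing the mouse through React state; a minimal sketch of a light that follows the mouse:

```jsx
import React, { useRef } from 'react';
import { useFrame } from 'react-three-fiber';

// Follows the mouse by mutating the light's position every frame inside
// useFrame. No setState is involved, so React never re-renders on mouse move.
function FollowLight() {
  const light = useRef();
  useFrame(({ mouse, viewport }) => {
    light.current.position.set(
      (mouse.x * viewport.width) / 2,
      (mouse.y * viewport.height) / 2,
      2
    );
  });
  return <pointLight ref={light} intensity={1.5} />;
}
```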

Takeaways

I learned a lot about three.js and performance. In the next assignment, I’d like to experiment more with moving the camera and loading external resources for objects.