Sunday, December 29, 2019

How much of a genius-level move was using binary space partitioning in Doom?

Cutting edge at the time, we swear.

In 1993, id Software released the first-person shooter Doom, which quickly became a phenomenon. The game is now considered one of the most influential games of all time.

A decade after Doom's release, in 2003, journalist David Kushner published a book about id Software called Masters of Doom, which has since become the canonical account of Doom's creation. I read Masters of Doom a few years ago and don't remember much of it now, but there was one story in the book about lead programmer John Carmack that has stuck with me. This is a loose gloss of the story (see below for the full details), but essentially, early in the development of Doom, Carmack realized that the 3D renderer he had written for the game slowed to a crawl when trying to render certain levels. This was unacceptable, because Doom was supposed to be action-packed and frenetic. So Carmack, realizing the problem with his renderer was fundamental enough that he would need to find a better rendering algorithm, started reading research papers. He eventually implemented a technique called "binary space partitioning," never before used in a video game, that dramatically sped up the Doom engine.

That story about Carmack applying cutting-edge academic research to video games has always impressed me. It is my explanation for why Carmack has become such a legendary figure. He deserves to be known as the archetypal genius video game programmer for all sorts of reasons, but this episode with the academic papers and the binary space partitioning is the justification I think of first.

Obviously, the story is impressive because "binary space partitioning" sounds like it would be a difficult thing to just read about and implement yourself. I've long assumed that what Carmack did was a clever intellectual leap, but because I've never understood what binary space partitioning is or how novel a technique it was when Carmack decided to use it, I've never known for sure. On a spectrum from Homer Simpson to Albert Einstein, how much of a genius-level move was it really for Carmack to add binary space partitioning to Doom?

I've also wondered where binary space partitioning first came from and how the idea found its way to Carmack. So this post is about John Carmack and Doom, but it is also about the history of a data structure: the binary space partitioning tree (or BSP tree). It turns out that the BSP tree, rather interestingly, and like so many things in computer science, has its origins in research conducted for the military.

That's right: E1M1, the first level of Doom, was brought to you by the US Air Force.

The VSD problem

The BSP tree is a solution to one of the thorniest problems in computer graphics. In order to render a three-dimensional scene, a renderer has to figure out, given a particular viewpoint, what can be seen and what cannot be seen. This is not especially challenging if you have lots of time, but a respectable real-time game engine needs to figure out what can be seen and what cannot be seen at least 30 times a second.

This problem is sometimes called the problem of visible surface determination. Michael Abrash, a programmer who worked with Carmack on Quake (id Software's follow-up to Doom), wrote about the VSD problem in his famous Graphics Programming Black Book:

I want to talk about what is, in my opinion, the toughest 3-D problem of all: visible surface determination (drawing the proper surface at each pixel), and its close relative, culling (discarding non-visible polygons as quickly as possible, a way of accelerating visible surface determination). In the interests of brevity, I'll use the abbreviation VSD to mean both visible surface determination and culling from now on.

Why do I think VSD is the toughest 3-D challenge? Although rasterization issues such as texture mapping are fascinating and important, they are tasks of relatively finite scope, and are being moved into hardware as 3-D accelerators appear; also, they only scale with increases in screen resolution, which are relatively modest.

In contrast, VSD is an open-ended problem, and there are dozens of approaches currently in use. Even more significantly, the performance of VSD, done in an unsophisticated fashion, scales directly with scene complexity, which tends to increase as a square or cube function, so this very rapidly becomes the limiting factor in rendering realistic worlds.

Abrash was writing about the difficulty of the VSD problem in the late '90s, years after Doom had proved that regular people wanted to be able to play graphically intensive games on their home computers. In the early '90s, when id Software first began publishing games, the games had to be programmed to run efficiently on computers not designed to run them, computers meant for word processing, spreadsheet applications, and little else. To make this work, especially for the few 3D games that id Software published before Doom, id Software had to be creative. In these games, the design of all the levels was constrained in such a way that the VSD problem was easier to solve.

For example, in Wolfenstein 3D, the game id Software released just prior to Doom, every level is made from walls that are axis-aligned. In other words, in the Wolfenstein universe, you can have north-south walls or west-east walls, but nothing else. Walls can also only be placed at fixed intervals on a grid—all hallways are either one grid square wide, or two grid squares wide, etc., but never 2.5 grid squares wide. Though this meant that the id Software team could only design levels that all looked somewhat the same, it made Carmack's job of writing a renderer for Wolfenstein much simpler.

The Wolfenstein renderer solved the VSD problem by "marching" rays into the virtual world from the screen. Usually a renderer that uses rays is a "raycasting" renderer—these renderers are often slow, because solving the VSD problem in a raycaster involves finding the first intersection between a ray and something in your world, which in the general case requires lots of number crunching. But in Wolfenstein, because all the walls are aligned with the grid, the only location a ray can possibly intersect a wall is at the grid lines. So all the renderer needs to do is check each of those intersection points. If the renderer starts by checking the intersection point nearest to the player's viewpoint, then checks the next nearest, and so on, and stops when it encounters the first wall, the VSD problem has been solved in an almost trivial way. A ray is just marched forward from each pixel until it hits something, which works because the marching is so cheap in terms of CPU cycles. And actually, since all walls are the same height, it is only necessary to march a single ray for every column of pixels.
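
To make that concrete, here is a minimal Python sketch of grid-constrained ray marching. It is not id's code: the map, screen width, field of view, and the is_wall callback are all invented for illustration. Each ray checks only the points where it crosses a grid line, and one ray is cast per screen column.

```python
import math

def first_wall_hit(px, py, angle, is_wall, max_dist=64.0):
    """Walk a ray across an axis-aligned grid, checking only the points where it
    crosses grid lines. Returns the distance to the first wall it enters.
    Assumes the ray is not exactly parallel to either axis."""
    dx, dy = math.cos(angle), math.sin(angle)
    map_x, map_y = int(px), int(py)
    delta_x, delta_y = abs(1 / dx), abs(1 / dy)   # ray length between successive grid crossings
    step_x = 1 if dx > 0 else -1
    step_y = 1 if dy > 0 else -1
    side_x = ((map_x + 1 - px) if dx > 0 else (px - map_x)) * delta_x
    side_y = ((map_y + 1 - py) if dy > 0 else (py - map_y)) * delta_y
    dist = 0.0
    while dist < max_dist:
        if side_x < side_y:                       # next crossing is a vertical grid line
            dist, side_x, map_x = side_x, side_x + delta_x, map_x + step_x
        else:                                     # next crossing is a horizontal grid line
            dist, side_y, map_y = side_y, side_y + delta_y, map_y + step_y
        if is_wall(map_x, map_y):                 # first intersection wins: VSD solved
            return dist
    return max_dist

def render_columns(px, py, facing, is_wall, screen_w=320, fov=math.pi / 3):
    """One ray per screen column; nearer walls project taller."""
    heights = []
    for col in range(screen_w):
        ray_angle = facing - fov / 2 + fov * (col + 0.5) / screen_w
        dist = first_wall_hit(px, py, ray_angle, is_wall)
        heights.append(int(200 / max(dist, 0.01)))
    return heights

MAP = ["#######",
       "#.....#",
       "#..#..#",
       "#.....#",
       "#######"]
columns = render_columns(2.5, 2.5, facing=0.0, is_wall=lambda x, y: MAP[y][x] == "#")
```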

This rendering shortcut made Wolfenstein fast enough to run on underpowered home PCs in the era before dedicated graphics cards. But this approach would not work for Doom, since the id team had decided that their new game would feature novel things like diagonal walls, stairs, and ceilings of different heights. Ray marching was no longer viable, so Carmack wrote a different kind of renderer. Whereas the Wolfenstein renderer, with its ray for every column of pixels, is an "image-first" renderer, the Doom renderer is an "object-first" renderer. This means that rather than iterating through the pixels on screen and figuring out what color they should be, the Doom renderer iterates through the objects in a scene and projects each onto the screen in turn.

In an object-first renderer, one easy way to solve the VSD problem is to use a z-buffer. Each time you project an object onto the screen, for each pixel you want to draw to, you do a check. If the part of the object you want to draw is closer to the player than what was already drawn to the pixel, then you can overwrite what is there. Otherwise you have to leave the pixel as is. This approach is simple, but a z-buffer requires a lot of memory, and the renderer may still expend a lot of CPU cycles projecting level geometry that is never going to be seen by the player.
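
As a rough illustration of the z-buffer idea, here is a minimal Python sketch. The project() helper, assumed to turn an object into (x, y, depth, color) fragments, is hypothetical; none of this is Doom's actual renderer.

```python
import math

def render_with_zbuffer(objects, width, height, project):
    depth = [[math.inf] * width for _ in range(height)]   # nearest depth seen per pixel
    frame = [[None] * width for _ in range(height)]       # output colors
    for obj in objects:                                   # object-first: any order works
        for x, y, z, color in project(obj):
            if z < depth[y][x]:                           # closer than what's drawn here?
                depth[y][x] = z
                frame[y][x] = color                       # overwrite the pixel
    return frame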

In the early 1990s, there was an additional drawback to the z-buffer approach: On IBM-compatible PCs, which used a video adapter system called VGA, writing to the output frame buffer was an expensive operation. So time spent drawing pixels that would only get overwritten later tanked the performance of your renderer.

Since writing to the frame buffer was so expensive, the ideal renderer was one that started by drawing the objects closest to the player, then the objects just beyond those objects, and so on, until every pixel on screen had been written to. At that point the renderer would know to stop, saving all the time it might have spent considering far-away objects that the player cannot see. But ordering the objects in a scene this way, from closest to farthest, is tantamount to solving the VSD problem. Once again, the question is: What can be seen by the player?
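
For contrast, here is a sketch of that ideal front-to-back renderer, reusing the same hypothetical project() helper. Each pixel is written at most once and rendering stops as soon as the screen is full; the catch is the first argument, because producing the objects in nearest-first order is exactly the VSD problem.

```python
def render_front_to_back(objects_near_to_far, width, height, project):
    """Draw front to back, stopping once every pixel has been written.
    Assumes the caller can supply objects nearest-first, which is the hard part."""
    frame = [[None] * width for _ in range(height)]
    remaining = width * height                      # pixels not yet written
    for obj in objects_near_to_far:
        for x, y, _z, color in project(obj):
            if frame[y][x] is None:                 # never overwrite: nearer objects won
                frame[y][x] = color
                remaining -= 1
        if remaining == 0:                          # screen full: skip everything farther away
            break
    return frame
```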

[Embedded video: YouTuber Bisqwit demonstrates a renderer using the same general algorithm, run at extra-slow speed.]

Initially, Carmack tried to solve this problem by relying on the layout of Doom's levels. His renderer started by drawing the walls of the room currently occupied by the player, then flooded out into neighboring rooms to draw the walls in those rooms that could be seen from the current room. Provided that every room was convex, this solved the VSD issue. Rooms that were not convex could be split into convex "sectors." You can see how this rendering technique might have looked if run at extra-slow speed in the video above, where YouTuber Bisqwit demonstrates a renderer of his own that works according to the same general algorithm. This algorithm was successfully used in Duke Nukem 3D, released three years after Doom, when CPUs were more powerful. But, in 1993, running on the hardware then available, the Doom renderer that used this algorithm struggled with complicated levels—particularly when sectors were nested inside of each other, which was the only way to create something like a circular pit of stairs. A circular pit of stairs led to lots of repeated recursive descents into a sector that had already been drawn, strangling the game engine's speed.
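
To show the shape of that flood-out approach, here is a rough Python sketch. The Sector type, the print-based "drawing," and the rule that every portal gets visited are inventions for illustration; this is not Doom's real data structure or its actual early renderer.

```python
from dataclasses import dataclass, field

@dataclass
class Sector:
    name: str
    walls: list = field(default_factory=list)      # wall segments of this convex sector
    portals: list = field(default_factory=list)    # openings into neighbouring sectors

def render_from(sector, viewpoint, visited=None):
    visited = set() if visited is None else visited
    if sector.name in visited:
        return
    visited.add(sector.name)
    for wall in sector.walls:                       # convex: walls never occlude each other
        print(f"draw {wall} as seen from {viewpoint}")
    for neighbor in sector.portals:                 # flood into whatever can be seen next
        render_from(neighbor, viewpoint, visited)

# Two rooms joined by a doorway.
lobby = Sector("lobby", walls=["north wall", "west wall"])
hall = Sector("hall", walls=["long east wall"], portals=[lobby])
lobby.portals.append(hall)
render_from(hall, viewpoint=(0, 0))
```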

Around the time that the id team realized that the Doom game engine might be too slow, id Software was asked to port Wolfenstein 3D to the Super Nintendo. The Super Nintendo was even less powerful than the IBM-compatible PCs of the day, and it turned out that the ray-marching Wolfenstein renderer, simple as it was, didn't run fast enough on the Super Nintendo hardware. So Carmack began looking for a better algorithm. It was actually for the Super Nintendo port of Wolfenstein that Carmack first researched and implemented binary space partitioning. In Wolfenstein, this was relatively straightforward because all the walls were axis-aligned; in Doom, it would be more complex. But Carmack realized that BSP trees would solve Doom's speed problems too.
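
The rest of the story digs into how BSP trees actually work; as a preview of why they appealed to Carmack, here is a hedged 2-D sketch with invented data (not Doom's node format). Once the tree is built ahead of time, walking it for any viewpoint needs only a cheap which-side-of-the-line test per node, and the walls come out in front-to-back order with no per-frame sorting.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BSPNode:
    a: tuple                                # endpoints of the splitting line segment
    b: tuple
    front: Optional["BSPNode"] = None       # subtree where side_of() > 0
    back: Optional["BSPNode"] = None        # subtree where side_of() < 0

def side_of(node, point):
    (ax, ay), (bx, by) = node.a, node.b
    px, py = point
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)   # signed area test

def walk_front_to_back(node, viewpoint, out):
    """Emit wall segments nearest-first, with no per-frame sorting at all."""
    if node is None:
        return
    if side_of(node, viewpoint) >= 0:       # viewer is on the front side of the splitter
        walk_front_to_back(node.front, viewpoint, out)
        out.append((node.a, node.b))
        walk_front_to_back(node.back, viewpoint, out)
    else:                                   # viewer is on the back side
        walk_front_to_back(node.back, viewpoint, out)
        out.append((node.a, node.b))
        walk_front_to_back(node.front, viewpoint, out)

# Tiny two-node tree: a vertical splitter (the line x = 0) with a wall on each side.
tree = BSPNode((0, -5), (0, 5),
               front=BSPNode((-2, -1), (-2, 1)),
               back=BSPNode((2, -1), (2, 1)))
order = []
walk_front_to_back(tree, viewpoint=(3, 0), out=order)
print(order)    # nearest wall (x = 2) comes out first, farthest (x = -2) last
```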


via Ars Technica https://arstechnica.com

Monday, December 16, 2019

Hey, Ubuntu? You Need a Better Image Viewer…

I think that Ubuntu needs a better default image viewer — and in this post I'm going to try and explain why!

Eye of GNOME is nowhere near as featured as the image viewers on other platforms, including Chrome OS!

Now, don't get me wrong: 'Eye of GNOME' (which is often referred to by the package name of 'eog') does its job well. It lets you view images stored on your computer without any fuss.

But therein lies the rub; eog can't do much more than that. The app is simply nowhere near as featured as the default image viewers being shipped on other platforms, including Android, and even Chrome OS!

It's for this reason that I made changing the image viewer a step in my list of things to do after installing Ubuntu 19.10.

The Job of an Image Viewer

Admittedly it's been a long time since I last dove into the world of open source image viewers (props to anyone who remembers Viewnior, an app I blogged endlessly about circa 2010. Here's hoping it gets a GTK3 port one day).

Yet, after every Ubuntu install I still do the exact same thing: make Shotwell the default image viewer for all supported image formats, including .jpeg and .png.

Why? Because Shotwell (as an image viewer) has a tonne of features that I use often, and it puts them in a really accessible place.

Now, you might be sat there thinking that I simply expect more from an image viewer than a regular user does.

But I'd disagree.

Features found across platforms

The default image viewing apps on both Windows and macOS let folks do far more with an image than simply view it. They include options to resize and crop, add text and callouts, and even perform some basic image enhancement.

Preview in macOS 10.12.6 with markup enabled

Folks switching to Linux from those systems may expect a comparable set of features in the native image viewer, only to find eog lacking.

If Ubuntu users would appreciate having some of those capabilities within easy reach too, and since Shotwell provides them, ought it to be the default instead?

Modern Expectations?

Now GNOME developers would, one imagines, reason that, as an image viewer, EOG should focus on viewing images and leave image editing to image editors, organisation to photo managers, and so on.

But while that explanation is fairly reasonable I do feel it overlooks the core reality of why most people use an image viewer today.

And spoiler: it isn't just to gawp at photos!

Viewing images is step 1, anticipate step 2

Thanks to smartphones, social networks, and ephemeral messaging services we send and receive more images than ever before. From gifs and selfies, to screenshots and wallpapers.

And, like many, I tend to view an image as the first step in a longer chain, usually to check that the photo in question is the one I'm looking to share or send or post or whatever else I want to do with it.

As part of that flow I usually make some basic edits, like cropping, resizing, or converting the image to a lossy format.

Shotwell caters to all of that, within the same app, and in the same window. I don't need to load my image in an external app to make edits (then save the image, then open the image in the image viewer again to check it's the edited copy).

On screen controls

Having essential editing features available in an image viewer saves me time. Do they need to be on screen all the time (like they are in Shotwell)? Probably not.

Which brings me back to eog.

Now, I'm not advocating that eog transition to a full-fledged photo management app, but I do think that some thought should be given towards modern expectations and needs.

For instance, when I open an image in eog I get four on-screen buttons: prev/next image and rotate left/rotate right:

Unless there's been a sudden uptick in the sale of digital cameras from the 1990s, why does rotating deserve omnipresent controls on every image?

I rarely need to rotate an image, certainly nowhere near often enough to need on-screen controls plastered over every photo I view.

Eye of GNOME also lacks a couple of basic image editing features that the Shotwell image viewer natively provides, like image cropping and ratio resizing.

That said, Shotwell isn't flawless either, as this feature comparison shows:

Features compared, Eye of GNOME vs. Shotwell (Image Viewer):
Play animated .gifs
Zoom
Resize image
Image cropping
Image rotation
Format conversion
Adjust image quality
Editing tools
Set image as wallpaper
Slideshow option
Show EXIF/info
Show transparent images

Tl;dr

We all use images way more than we used to. Ubuntu should ship with a modern image viewer, like Shotwell, to anticipate and cater to those needs.

What's your take? Let me know in the comments


via OMG! Ubuntu! https://ift.tt/2hvL2Dj

11 Best Cheap Laptops We Actually Like Using ($300 - $800)


If your budget is tight and you want the most bang for your buck, or you just want to keep something out of the landfill, the used laptop market is worth considering. I'm not going to link to or endorse any specific vendors, but I've had great luck buying used laptops on eBay from all sorts of sellers (both pros and regular people).

To score the best deal, make sure you know the market. Do some research first to figure out a machine that suits your needs. The easiest machines to come by, and therefore usually the best deals, tend to be boring, business-oriented models. I happen to like ThinkPads, which tend to be used by large corporations and then dumped all at once, which means there's lots to choose from and they're cheap.

Aim for These Specs: Try to get a laptop with at least an 8th-generation Intel Core i3 processor, 8 GB of RAM, 128 GB of storage (preferably a solid-state drive, or SSD), and at least a 13-inch display that's close to HD.

Finding Laptops on eBay: Once you know what you want, search for it on eBay. Scroll down and check the option to only show "Sold listings." Now take the 10 most recent sales, add up the prices, and divide by 10. That's the average price; don't pay more than that. Keep the lowest price in mind; that's the great-deal price. Now, uncheck the sold listings option and see what's listed between the lowest price and that average price. Those are the deals worth considering. I suggest watching a few. Don't bid, or participate at all. Just watch them until they end and see how high the auctions go.
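
As a trivial sketch of that pricing rule, with made-up numbers rather than real eBay data:

```python
# Average the ten most recent sold prices, then treat anything between the
# cheapest sale and that average as a deal worth watching. Prices are invented.
sold_prices = [310, 285, 342, 299, 275, 330, 315, 290, 305, 320]   # 10 most recent sales
average = sum(sold_prices) / len(sold_prices)      # don't pay more than this
great_deal = min(sold_prices)                      # the "great deal" benchmark

def worth_watching(asking_price):
    return great_deal <= asking_price <= average

print(f"average: ${average:.2f}, great deal: ${great_deal}")
print([p for p in (265, 295, 350) if worth_watching(p)])   # only 295 makes the cut
```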

Once you have a feel for the market, and what you should be paying, you'll know when you've found a deal. When you find it, wait. Don't bid until the last few minutes of the auction. You don't want other bidders to have a chance to react. Remember that if you miss out on something it's not the end of the world. There's always something new being listed on eBay.


via Wired Top Stories https://ift.tt/2uc60ci

I created my own deepfake—it took two weeks and cost $552


Deepfake technology uses deep neural networks to convincingly replace one face with another in a video. The technology has obvious potential for abuse and is becoming ever more widely accessible. Many good articles have been written about the important social and political implications of this trend.

This isn't one of those articles. Instead, in classic Ars Technica fashion, I'm going to take a close look at the technology itself: how does deepfake software work? How hard is it to use—and how good are the results?

I thought the best way to answer these questions would be to create a deepfake of my own. My Ars overlords gave me a few days to play around with deepfake software and a $1,000 cloud computing budget. A couple of weeks later, I have my result, which you can see above. I started with a video of Mark Zuckerberg testifying before Congress and replaced his face with that of Lieutenant Commander Data (Brent Spiner) from Star Trek: The Next Generation. Total spent: $552.

The video isn't perfect. It doesn't quite capture the full details of Data's face, and if you look closely you can see some artifacts around the edges.

Still, what's remarkable is that a neophyte like me can create fairly convincing video so quickly and for so little money. And there's every reason to think deepfake technology will continue to get better, faster, and cheaper in the coming years.

In this article I'll take you with me on my deepfake journey. I'll explain each step required to create a deepfake video. Along the way, I'll explain how the underlying technology works and explore some of its limitations.

Deepfakes need a lot of computing power and data

We call them deepfakes because they use deep neural networks. Over the last decade, computer scientists have discovered that neural networks become more and more powerful as you add additional layers of neurons (see the first installment of this series for a general introduction to neural networks). But to unlock the full power of these deeper networks, you need a lot of data and a whole lot of computing power.

That's certainly true of deepfakes. For this project, I rented a virtual machine with four beefy graphics cards. Even with all that horsepower, it took almost a week to train my deepfake model.

I also needed a heap of images of both Mark Zuckerberg and Mr. Data. My final video above is only 38 seconds long, but I needed to gather a lot more footage—of both Zuckerberg and Data—for training.

To do this, I downloaded a bunch of videos containing their faces: 14 videos with clips from Star Trek: The Next Generation and nine videos featuring Mark Zuckerberg. My Zuckerberg videos included formal speeches, a couple of television interviews, and even footage of Zuckerberg smoking meat in his backyard.

I loaded all of these clips into iMovie and deleted sections that didn't contain Zuckerberg's or Data's face. I also cut down longer sequences. Deepfake software doesn't just need a huge number of images; it needs a huge number of different images. It needs to see a face from different angles, with different expressions, and in different lighting conditions. An hour-long video of Mark Zuckerberg giving a speech may not provide much more value than a five-minute segment of the same speech, because it just shows the same angles, lighting conditions, and expressions over and over again. So I trimmed several hours of footage down to 9 minutes of Data and 7 minutes of Zuckerberg.
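
The deepfake software handles its own frame and face extraction, so the snippet below is not part of the pipeline described here; it is just a hedged illustration of the "many different images" point, sampling frames from a trimmed clip at half-second intervals with OpenCV so that near-duplicate frames don't dominate a training set.

```python
import os
import cv2  # pip install opencv-python

def sample_frames(video_path, out_dir, seconds_between=0.5):
    """Save one frame every `seconds_between` seconds of video."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30          # fall back if metadata is missing
    step = max(1, int(fps * seconds_between))
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                                 # end of the clip
            break
        if index % step == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:05d}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved

# e.g. sample_frames("zuckerberg_speech.mp4", "training/zuckerberg")
```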


via Ars Technica https://arstechnica.com