Blog

DAG vs. Tree

I am wondering whether the data structure “tree” might be a giant mistake in the history of computer science. Almost everything that can be saved in a tree is really a DAG.

Some background: A tree is a data structure consisting of nodes, where each node has one or no parent (this is technically a forest; a tree has only one node with no parent). Each node has children; specifically the nodes that have that node as its parent. Circles are forbidden.

The best-known example for this are computer file systems. Each file is in a folder. That folder is in another folder and so on, until you reach the drive, which forms the root of the tree. Mail lists and some kinds of forums and comment systems also work like that: Each post is a reply to exactly one previous post (at most) and gets shown a layer below. However, tree structures predate computer science. The whole field of Taxonomy as well as linguistics, both with their habit of grouping things in families and sub-families and so on, do exactly the same thing.

It’s easy to process trees automatically, for example to find all children. You can also identify each child uniquely through a given path.

There is just one problem with them: Reality does not work like that. In most real-world cases things, can have multiple parents. A classic example: What folder does a file fit in? Likewise, in discussions, sometimes you want to reply to several posts that make similar points at once. In linguistics, languages don’t really belong to one family exclusively, they have influence from lost of different ones.

To get around that, one either has to pick a parent explicitly, or split the object. Splitting works for discussions well, for files sometimes, for e.g. classification of car types not at all. In either case, information about the semantic relationships is lost.

The answer to that is called DAG, for Directed Acyclic Graph. The rules are almost the same as a tree, only a node can have more than one parent as well. An example: Media library apps like iTunes allow a song to appear in several playlists at once.

DAGs represent the examples here naturally. A file can be in two folders, for example. Since a tree is a special case of a DAG, it’s also easily possible to represent the cases that actually are tree-like.

In reality many systems already include hacks to create DAG structures. For example, thanks to Hard links, most file systems actually are DAGs, they just don’t show it that often. Symlinks, Aliases and Windows shortcut files all create a “second-class parent”, which explicitly stores the removed relationships.

Another approach is through tag systems, where a file, blog post or similar can have an arbitrary number of tags. This structure is actually just a two-level DAG. That may seem like a solution, but since it isn’t possible to form relationships between tags, and because many use cases of trees can’t be modeled that way (for example discussions), it isn’t a full replacement.

Admittedly, from a technical point of view, DAGs are much harder. For example: A tree is always planar, which means you can easily show it on a screen without crossing lines, something that is not guaranteed for DAGs. Enumerating all elements without repeating any is a lot harder. Objects can be reachable through two completely distinct paths - which is of course the goal, but it means you have to look at the graph to answer the question “do these two paths end at the same object”, which is trivial with trees.

Most importantly, I know no really good UI to show and manipulate DAGs. For example, the opposite to a tree structure in forums is not a DAG, it’s a simple list ordered by time, and the relationships between posts have to be shown manually in the text.

(Of course, this does not affect the use of trees in cases where end users don’t see them, for example search trees, heaps and so on.)

What do you think? Due to too much spam, I’ve disabled the comments here for now, but you can reach me on Twitter via @zcochrane.

30 years of Mac

When I bought my first Mac in 2002 (an eMac with 700 MHz G4), I considered myself a late-comer to the party. Today, in most groups of Mac users, I am among the one who has been using one the longest.

Unlike the many people who are now posting about how they saw their first Mac and were instantly hooked, I sort of slid into the whole thing. The first Mac I ever saw was a PowerMac G4, still with pinstripes, owned by my father for layout work. It ran Mac OS 9, which was certainly different but didn’t seem in any way interesting. That changed considerably with the second Mac I ever saw: An iBook G3, the white edition, running OS X 10.1. It was rather slow, there was barely any software, but I really liked it. I’d like to say that this was because I saw the great vision and the things to come, but in reality, pretty graphics were probably the real factor. A big part of that were the free development tools: Project Builder and Interface Builder, these days both known as Xcode. Finally, I wanted iTunes and an iPod. Both were really, really impressive to someone who was used to the horrors of Winamp and burning CDs for portable CD players.

It’s been an odd ride since then. When I started, the transition from OS 9 to OS X was still in full swing. Then Apple switched to Intel, then iOS (or rather the iPhone OS as it was known back then) became a huge thing. Apple rumors used to be an important and often accurate source of information, but that scene has almost completely died. Perhaps because all the rumors that ran forever (Apple will switch to Intel! Apple will release a successor to the Newton! Apple is building a phone! Apple will make a mouse with a second button! Apple will make a tablet!) have ultimately proven true, but usually not in the way anyone expected.

Important for me: I learned basically everything I know about programming and software development on my various Macs. It’s still my biggest hobby, and its how I make money these days, and while I have no idea whether I’d be elsewhere without a Mac, I know that without it, it would have been a lot less fun.

Agents of Shield - Checklist filled

So let’s see how that checklist works out in practice then. Oh yeah, spoilers:

  • A group of people that don’t get along initially, but grow to become a family? - Yeah. Still working on that growing to be a family.
  • A scene where it looks like a tiny girl is in danger, but in reality she’s the one in control of the whole situation and the most dangerous thing all around? - Yeah. This time she’s evil. In a wider sense, the villain of the week is male but portrays that power reversal, too.
  • A “oner”, meaning a very long shot where the camera moves to a space without cutting? - Not yet.
  • An episode where information is conveyed through music? - Not yet.
  • Beloved minor characters who die? - Not yet.
  • Characters in so much pain and angst that they can barely walk? - Yeah. Charles Gunn.
  • Pop culture references? - Oh yeah. “With great power comes… a whole lot of shit that you’re not prepared to deal with.”
  • Actors from previous Joss Whedon works? - Charles Gunn, Shepherd Book, the Serenity. And a discount version of Eliza Dushku.
  • A great sense of humor? - Hell yeah. “Sorry, the corner was dark, I couldn’t resist”
  • Horrible things happening to people in love? - still too early for that

Agents of S.H.I.E.L.D Checklist

Tomorrow, Agents of Shield1 will start. It’s a TV show set in the Avengers movie universe, focusing on somewhat more minor heroes. The promotional material has focused on showing action sequences and explosions. The images were of various people all looking strong and grim. In many ways, it sounds like you could skip it. Except for one thing: It’s the new TV series by Joss Whedon, the genius behind Buffy the Vampire Slayer, Angel, Firefly, Dr. Horrible’s Sing-Along Blog, Dollhouse and, oh yeah, the Avengers movie.

So there’s a question: Is this a generic action thing, or is this the next in the series of awesome things by Joss? To make up my mind, I’ve made a standard Whedon checklist that I’m going to fill as soon as the pilot is available on iTunes. Beware: This includes spoilers for all the shows mentioned above.

Does Agents of Shield have…?

  • A group of people that don’t get along initially, but grow to become a family? (as seen in Buffy, Angel, Firefly, Dollhouse to a lesser extent, Avengers)
  • A scene where it looks like a tiny girl is in danger, but in reality she’s the one in control of the whole situation and the most dangerous thing all around? (the mission statement of Buffy, but also very notable in Avengers where Natasha pulls it off twice)
  • A “oner”, meaning a very long shot where the camera moves to a space without cutting? (as seen in Buffy, Angel, Dollhouse, Firefly… basically everywhere. Avengers has its prime example in the middle of the fight in New York)
  • An episode where information is conveyed through music? (the musical episode in Buffy and Dr. Horrible’s Sing-Along-Blog are the most obvious examples, but don’t forget “The Hero of Canton” from Firefly)
  • Beloved minor characters who die? (Buffy does it a lot. Angel does it a lot. Firefly does it surprisingly often. Dollhouse does it. Dr. Horrible does it. Avengers does it, although the victim there gets resurrected for this TV show)
  • Characters in so much pain and angst that they can barely walk? (All of them)
  • Pop culture references? (For some reason they’re missing in Firefly; the rest has more than enough of them)
  • Actors from previous Joss Whedon works? (Avengers kind of averted that, but Cobie Smolders was at least a friend, and Clark Gregg got roped into appearing into Joss’s version of Much Ado About Nothing after Avengers was done).
  • A great sense of humor? (All of them)
  • Horrible things happening to people in love? (Avengers is still lacking that one. Be very afraid for Pepper in Avengers 2)

There are probably a few others that I missed.


  1. I am definitely not going to do all that punctuation stuff unless I’m getting paid for it. 

Hyperloop

I’ve been meaning to post this last week, but forgot. Hyperloop is a proposed new transportation system designed by Elon Musk, which focuses on small capsules traveling at high (up to 1200 km/h) speeds through partially evacuated steel tubes. There is a proposal available online. I believe it’s all crap, and I’m going to explain why.

Now, first of all, there is of course a chance that this blog post will look very embarrassing twenty years from now. One dismisses the founder of PayPal, Tesla motors and SpaceX at one’s own peril.

But on the other hand, it’s not like this hasn’t been tried before. Right from the moment the first trains appeared, people said “I can do better than that” and invented their own ‘improved’ designs. The Wuppertal Schwebebahn is a rare example that was actually built and then actually worked (and still does). Hyperloop’s approach in particular is very similar to a previous Swiss idea. However, that swiss idea did not have a compressor and was built in tunnels instead of elevated, and relied on much larger trains.

A more useful comparison is the Transrapid, a german high-speed Maglev train. It is beyond any doubt the world’s most successful high-speed Maglev train, because there exists one commercial installation, as an airport shuttle in Shanghai. Despite several decades of research and development, and the proof that the technology can work both in theory and in practice, the system has been ultimately abandoned and was really a waste of everyone’s money. Hyperloop is technologically very different from Transrapid, but many of the challenges are similar.

Can it work?

I have no idea whether the approach is actually workable. I seriously don’t know enough about any of the involved technology to say so. So I’ll have to give Hyperloop the benefit of the doubt and say that it probably can work from a technical point of view.

A railroad is not an island

That doesn’t mean it’s a good idea, though. The proposal focuses almost only on the technology, and does not discuss at all how to actually build a useful transportation system. An example:

The calculation presents two different models: One that is for passengers only, where the capsules have an exterior width of 1.4 m and height of 1.1 m, and another larger one. The smaller one relies on gullwing doors and seats that basically have the passengers lie down. Its main advantage is that it is a lot cheaper. Its main disadvantage is that it is illegal.

The reason for this is that such a system isn’t wheelchair accessible, because there is no space for a person seated in a wheelchair. We have thankfully finally gotten used to the idea that access for persons with reduced mobility is not something optional, but an essential part of preserving human dignity. It is neither realistic nor desirable to assume that the state of California would make an exception here.

Stations

Any actual transportation system (as opposed to a giant model railroad, which is what the current Hyperloop proposal shows) needs interfaces between it and the people who want to use it, or in other words, stations. These are not a thing that just happens, but rather an integral part of the overall design, and they require that you give them a lot of thought.

Hyperloop really doesn’t. It gives a vague explanation of the proposed experience. However, it lacks any details. Will there be shops? Parking spaces? Will stations be located in city centers, or on the outskirts? What about connections to existing rail systems, trams, or buses? What about evacuation in case of a fire, or about people opposing the construction? All these are real, important and expensive concerns. Just ask the people building new transportation hubs in Stuttgart or Berlin…

Switches

All the people who try to reinvent trains sooner or later hit upon a simple fact: The standard railroad switch is totally amazing. It requires only the space for the two tracks, and only a very tiny part of the overall construction has to move. They can change directions in about ten seconds. Any monorail approach has to either replace a part of the track, or bend it to a new direction. The Wuppertal Schwebebahn I mentioned above replaces the track; the Transrapid bends it. Both take much more energy and time.

Hyperloop hasn’t hit on that fact yet; reading the paper, it seems that the authors simply assume that switches exist, without thinking at all about it. That is dangerous, because a switch for Hyperloop may be the most complicated kind of switch invented yet. Not only does it require very large turning cycles, it also has to be completely air tight at either end, and you either have to keep the vacuum while it is switching, or restore it after you’re done. Switches are definitely necessary on this system, because there are several proposed spurs to minor californian cities.

Note that switches also have to be incredibly fast. Specifically, after a pod has cleared the switch, it must have completed its direction change before the next pod would have to start breaking in order to stop before the switch. This is because one cannot send a pod towards a currently changing switch, just hoping that it will be in a valid position when the pod arrives. This is completely unfeasible at the maximum 30 second distances between pods; one would have to group them by destination and leave a gap in between.

Safety

You just have to love the chapter on safety. It’s basically three paragraphs going “yeah, this is basically safe, so there”. There are no points on procedures or technologically, just notices that the track is completely enclosed, that trains can’t be accelerated to unsafe speeds, and that there are no pesky humans interfering. That is completely unacceptable.

How is train separation achieved? You can, of course, easily send pods through a tube at 1000 km/h in two-minute intervals and just hope that they all come back out in the same order. The authorities might even allow it if you pay enough bribes and don’t want to carry people. In real life, every capsule needs to be able to come to a controlled stop if the capsule before it does not clear the area it is currently at in time. If the system assumes thirty second headways at peak times, that means a capsule must be able to come to a complete stop within thirty seconds (because after that it would be where the stopped capsule is now). At 1200 km/h (= 333.3… m/s), that works out to a minimum emergency brake deceleration of 11.1… m/s, or slightly above 1g. In reality, you’d want some additional buffer, which only increases the forces. Those seat belts in the pods turn out to be absolutely essential. Note that the proposal includes no specification for the braking system whatsoever, only a cost estimate of more than $70,000.

All this assumes a totally safe communications system that works at these speeds and these distances. This isn’t a technical impossibility, but it is not going to be cheap either. For comparison: The vehicle-side interface for ETCS, the new european train control system, comes to about 200,000 € per locomotive if you can find it really cheap.

NIMBY

The environmental impact discussion is even worse than the area on safety. It simply assumes that nobody in California will have any problems with the world’s highest pipeline passing through or near their back yards. That is rather optimistic. Bundling most of the track with a motorway is a nice solution, but not an option in built-up settlements, where other alignments need to be found.

While we’re on the topic of alignments: How will the system reach its stations? For normal high speed rail, that is easy: The tracks and stations are already in place. This is not true for any new system which needs to either cut through a lot of houses or build very expensive tunnels. The tunnel solution is favored for new high-speed rail stations in city centers, e.g. London St Pancras International. Like any tunneling, it has a tendency to drive the costs up significantly.

Capacity

The capacity of the system is at the low end average for high-speed rail. The proposed smaller system has a target average of 840 people per hour; for rush hour, that can be increased by a factor of four (to 3360 people per hour). For comparison: The low end is about equal to what currently drives on the Paris-London Eurostar line, an altogether very under-utilized system. In Germany, the Cologne-Frankfurt high speed line (itself not exactly bustling with activity) has a capacity of about 2800 passengers per hour at the current timetable, although lack of trains may make that difficult.

However, trains scale much better. There is no particular reason why London-Paris can’t see the same number of passengers as Cologne-Frankfurt, and both could easily deal with twice as many passengers, too. The upper end currently seen in Europe would be the french high speed network, which can see trains composed of two double-deck TGV units every three minutes in the best case. This works out to an average peak capacity of more than 20,000 passengers per hour. This is probably not actually sustained for an entire hour. Still, the system scales relatively gracefully to very high numbers of passengers, while Hyperloop has a hard limit not all that far from current standards.

Cost

The cost suggestions are incredibly ambitious, or in other words, not even remotely accurate. Missing are items like a safe communication and control system, switches, any sort of traffic control, station development and so on. The running costs have been completely ignored (other than amortization of the investment). Also ignored: Any cost for testing and system development. For the Transrapid, that cost was on the order of a billion euros.

I am also not convinced that the costs that have been calculated are accurate. The pod, for example, costs less than half of a modern regional-service DMU (those tend to cost more than $2 million a piece). While the pod it is smaller, it requires pressurization and a lot of new and untried technology. Perhaps a more useful comparison would be a business jet, which is also pressurized (but with a far lower pressure difference), small, and contains a lot of expensive technology. That makes the price for the pod seem even more unrealistic.

Conclusion

Hyperloop is an interesting technological toy, but it is very far from an actual transportation solution, and I am not convinced that it can ever get there.

DB darf nach London

Verschiedene Nachrichtenseiten verbreiten derzeit die Meldung, dass die Deutsche Bahn AG jetzt durch den Kanaltunnel fahren darf. Dabei werden sehr viele eng zusammenhängende Themen durcheinander gebracht. Ich wollte das mal etwas klarer machen.

Worum geht es? International agierende Bahnunternehmen (und je nach Land auch nationale) benötigen innerhalb der EU zwei Sicherheitsbescheinigungen: Teil A zeigt, dass man allgemein Züge fahren kann und gilt Europaweit. Teil B wird für jedes Land von der zuständigen Sicherheitsbehörde neu ausgestellt und bezeugt, dass man sich mit den Regeln des betreffenden Landes auskennt. Der Eurotunnel zählt hier als mehr-oder-weniger eigenes Land, mit einer eigenen Aufsichtsbehörde, der Intergovernmental Comission oder kurz IGC. Diese hat der Bahn jetzt das entsprechende Zertifikat Teil B ausgestellt. Ob die Bahn das Zertifikat für Frankreich schon hat weiß ich nicht, aber ich vermute ja. Für Großbritannien könnte man notfalls auf eine der britischen Tochterunternehmen zurückgreifen.

(Nebenbei: All dies bezieht sich nur auf den Personenverkehr. Im Güterverkehr darf DB Schenket UK, früher bekannt als EWS, schon seit Ewigkeiten durch den Tunnel fahren.)

All das sind äußerst dröge Verwaltungsakte. Der Skandal, der es lustig macht, kommt aus zwei verschiedenen Richtungen:

  • DB will Eurostar, dass zu 55% der französischen Staatsbahn SNCF gehört, Konkurrenz machen.
  • Völlig unabhängig davon hat sich Eurostar entschieden, neue Züge von Siemens zu kaufen, statt wie bisher bei Alstom in Frankreich.

Beide Probleme überlappen sich in so weit, dass DB und Eurostar fast baugleiche Züge beschaffen und einsetzen wollen, namentlich den Siemens Velaro. Bekannt ist der Zug in letzter Zeit dadurch, dass die Version für die Deutsche Bahn immer noch keine Zulassung hat und nicht eingesetzt werden darf. Die Eurostar-Variante ist doppelt so lang und es ist immer noch unklar, ob sie für Deutschland zugelassen wird oder nicht, und auch die Farben werden anders. Die relevanten Punkte sind aber größtenteils gleich, inklusive der Verspätung.

Beide Züge müssen unter anderem für Belgien, Frankreich und Großbritannien zugelassen werden, und eben auch für das “Land” Kanaltunnel. Für die normalen Länder gibt es verbindliche Zulassungsregeln, was ein Zug tun können muss, wobei das “wie” ziemlich frei gestellt ist. Für den Kanaltunnel waren die Regeln im Prinzip ein Verweis auf die bestehenden Eurostar-Züge mit einer Notiz “So in etwa”. Solche Züge werden aber nicht mehr gebaut oder nachgefragt, daher mussten die Regeln auf jeden Fall angepasst werden um neue, gleich sichere Züge mit anderen Konzepten zu erlauben.

Dies ist natürlich die Stellschraube, mit der die französische Regierung, welche die Hälfte der IGC stellt, ihre Interessen durchsetzen wollte. Wobei die Interessen durchaus nicht gleich gelagert waren. Die SNCF wollte keine Konkurrenz durch die DB, hat sich aber beim Eurostar-Zug sehr zurückgehalten. Ich persönlich denke, dass man es in der SNCF-Zentrale nur wenig Probleme damit hatte, Alstom zu zeigen, dass der bisherige Hoflieferant nicht unersetzbar ist. Die französische Regierung wiederum wollte auf jeden Fall Alstom stützen und hat das auch sehr deutlich gemacht.

Am Ende hat sich das ganze aber doch geklärt. DB gab ihren Plan auf, einen ICE von Frankfurt nach Marseille zu schicken, ohne die SNCF zu beteiligen. Die Verbindung wird jetzt als Kooperation beider betrieben und fährt mit TGV weil die neuen ICE noch nicht da sind. Im Gegenzug blockiert die SNCF nicht den ICE nach London. Währenddessen bestellte die SNCF mehr normale Doppelstock-TGVs bei Alstom, die daraufhin ihren Widerstand gegen die Siemens-Züge für Eurostar aufgaben - vermutlich sowieso eine gute Idee, da die bis dahin befragten Gerichte die Klage von Alstom durchweg abwiesen.

Fehlte also nur noch der Verwaltungsakt mit der Sicherheitsbescheinigung, was hiermit erledigt ist. Na ja, und noch ein paar andere Sachen:

  • Die Züge müssen natürlich für Frankreich, Großbritannien, den Kanaltunnel, Belgien, die Niederlande und teilweise Deutschland zugelassen werden. Das haben sie bis jetzt noch nicht geschafft und zumindest für den Kanaltunnel noch nicht mal begonnen.
  • Das Vereinigte Königreich verlangt, dass eine Grenzkontrolle stattfindet, bevor die Gäste in den Zug in den Tunnel gelassen werden. DB wollte zuerst die bewährte Vor-Schengen-Art anwenden: Kontrolle im fahrenden Zug. Das sahen die Briten anders. Die bestehenden Eurostar-Bahnhöfe haben extra Sicherheitszonen. Diese auf anderen Bahnhöfen nachzurüsten wird teuer und schwierig, wenn überhaupt. Und auch Leute, die nur nach Brüssel wollen, müssten dort hindurch.

Aber dann: Auf nach London!

Trains in Games

Many years ago, I had started a website dealing with trains in video games. Now I’ve finally updated it again, with reviews for Bioshock Infinite and Dishonored. I’ve also changed some other stuff.

Right now I’m wondering whether I’ll implement these changes here, too. In particular, I’m uncertain about the comments. Weeding out the spam used to take a long time, but that vanished when I started with my self-written spam filter. Sadly, that filter is way too strong. It has even killed comments by me at times. So I’m not all that certain about this yet.