Penalize inner-link U-turns (builds on #83) #88

stefanholder · 2016-12-13T09:06:57Z

Things to note

Requires Refactor unfavoring of virtual edges (#885) graphhopper#909
mm-uturn2.gpx matches without U-turns with gps_accuracy>=80. Otherwise, there is a U-turn at a real node, which we could address in a separate PR.
I squashed @kodonnell's first two commits from prevent inner-link U-turns #83 because the second commit fixes the line endings of the first commit.
Added a penalty to the path distance for each unfavored edge contained in the path.
Use non-normalized transition metric because the timeDiff==0 check didn't work with directed candidates and was broken anyway.
Shortest weighting does not use penalties, hence unfavoring virtual edges would not work there.
Implemented logging of candidates and paths. After fixing the remaining broken unit test, it probably makes sense to set the logging level to WARN for MapMatching.java.

When looking at the unit tests I was wondering why the map match results are only checked for the correct sequence of street names but not for the correct sequence of real nodes / real edges. Is there a reason for this?

@karussell, @kodonnell: When finally merging this PR, if you want to squash commits please only squash commits of the same author and please do not squash the entire PR so that it is still clear which commits were contributed by which author.

stefanholder · 2016-12-13T09:10:25Z

matching-core/src/test/java/com/graphhopper/matching/MapMatchingTest.java

-                .hints(new PMap().put(Parameters.CH.DISABLE, false))
+                .maxVisitedNodes(1000)
+                .hints(new PMap().put(Parameters.CH.DISABLE, false)
+                        //TODO Fix that CH routing still uses default penalty of 300


testUTurns for CH still fails because CH routing does not consider the set parameter value for Parameters.Routing.HEADING_PENALTY. @karussell, can you please help?

CH cannot consider this value in most cases due to theoretical limitations, but sometimes it works, so you could try whether that is the case here via put(CH.FORCE_HEADING, true). If not, we would need to exclude the test for CH somehow.

put(CH.FORCE_HEADING, true) did not work, so I excluded the test for CH.

stefanholder · 2016-12-13T09:13:10Z

matching-core/src/main/java/com/graphhopper/matching/util/HmmProbabilities.java

     */
-    public double transitionLogProbability(double routeLength, double linearDistance,
-                                           double timeDiff) {
-        if (timeDiff == 0) {


For GPX traces with equal timestamps, all transitions had a probability of 1 and hence transitions were not considered during map matching. With directed candidates the situation got even worse because it could happen that the Viterbi algorithm chose a candidate with the wrong direction because penalties from unfavored edges would still result in a transition probability of 1. In this case the resulting map matching path would take unnecessary detours.

As discussed here I'd prefer a speed based limit where we break sequences (#87) if the speed between two candidates is too high. However, I think we can leave that until after this PR is merged.

As discussed here I would prefer that the user is able to specify if timestamps should be used. In some cases such as #13 there are just no (correct) timestamps available. However, checking speed between two candidates requires correct timestamps and should hence be optional.

because penalties from unfavored edges would still result in a transition probability of 1

@stefanholder would you mind to explain this to me? Do we just pick the distance then? But is this not somehow conflicting with the FastestWeighting we use?

With conflicting I mean e.g. when we calculate the correct fastest path going on a motorway and then still prefer the track besides the motorway which is shorter (but slower), because the matchings were slightly closer to the track. But using the fastest weighting for me implies that we should prefer matching to fast roads even if there are no time stamps attached. So more generally I think we should even use path.getWeight() so that one could in theory use a completely different road preference (not sure if this is useful, so maybe we should stick to time/fastest)

So more generally I think we should even use path.getWeight()

Good idea. I think we're all agreed that it'd be good to allow the user to configure their transition metric preference (or at least base it on the algoOptions). Maybe a separate PR?

But using the fastest weighting for me implies that we should prefer matching to fast roads even if there are no time stamps attached.

What I tried to explain in the original comment was that the timeDiff==0 check was broken and that the consequences would be even worse for directed candidates so this had to be fixed. The code change above makes sure that penalties from unfavored edges are also considered if all timestamps are equal.

With FastestWeighting and the route length transition metric we are currently using the map matcher favors the shortest routes out of all fastest routes between candidate pairs (independent of whether timestamps are equal or not). This means that the map matcher actually achieves a compromise between fast and short routes. Moreover, we are using the transition metric suggested and verified by Newson & Krumm. Changing this should be done in a separate PR and carefully evaluated. (This discussion actually belongs into #86.)
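To make the effect of the non-normalized metric concrete, here is a minimal sketch, assuming an exponential distribution over |linearDistance - routeLength| as in Newson & Krumm. The class and method names are hypothetical and not the actual HmmProbabilities API; beta = 2.0 and the 100 m penalty are simply the values that appear in this PR's diff.

```java
// Illustrative sketch only; names are hypothetical, not the actual HmmProbabilities API.
final class TransitionSketch {

    // Log of the exponential density (1/beta) * exp(-x/beta).
    static double logExponential(double beta, double x) {
        return Math.log(1.0 / beta) - x / beta;
    }

    // Non-normalized transition metric: timestamps are not involved at all,
    // so a penalty added to routeLength always lowers the log-probability.
    static double transitionLogProbability(double beta, double routeLength,
                                           double linearDistance) {
        double metric = Math.abs(linearDistance - routeLength);
        return logExponential(beta, metric);
    }
}
```

With a linear distance of 100 m, a 105 m route scores higher than the same route with a 100 m penalty added (205 m), whether or not the timestamps are equal.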

stefanholder · 2016-12-13T09:30:34Z

The Travis CI build has a problem with the maven-failsafe-plugin, which should be unrelated to my changes. @karussell, can you please have a look?

karussell · 2016-12-13T10:09:18Z

When finally merging this PR, if you want to squash commits please only squash commits of the same author and please do not squash the entire PR so that it is still clear which commits were contributed by which author.

I'm sorry. This happened before to @michaz too, which I'm very sorry about. I didn't intend to remove the authorship or something. We could explicitly add this here.

The problem is also that GitHub offers this as the default option, which I should change now. Also, squashing all commits into one is a lot simpler than doing this author by author; maybe we should aim for just very few commits per PR and avoid squashing at all.

stefanholder · 2016-12-13T10:17:21Z

I'm sorry. This happened before to @michaz too, which I'm very sorry about. I didn't intend to remove the authorship or something.

No problem. Actually, it didn't happen to my commits but I know that the GitHub squashing feature exists and I wanted to make you and other maintainers aware of the consequences.

maybe we should aim for just very few commits per PR and avoid squashing at all.

Sure, just let me know if you want me to squash my commits. For reviewing, it is sometimes better to have more commits, e.g. having refactorings in separate commits. This also makes it easy to revert certain commits.

karussell · 2016-12-13T10:35:14Z

Ok. I think the only improvement that squashing does is that a new feature is condensed into as few as possible commits so that it also can be easily reverted. But as I didn't use squashing ~6 months ago I think we can revert to this behaviour without tweaking authorship.

The Travis CI build has a problem with the maven-failsafe-plugin, which should be unrelated to my changes. @karussell, can you please have a look?

Looks like retriggering solved this, but we need a new GH core version deployed before.

karussell

Please see my comments

karussell · 2016-12-13T10:41:29Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+    // Penalty in m for each U-turn performed in the path between two subsequent candidates
+    // Note that this penalty should roughly correspond to the penalty used for unfavored
+    // virtual edges because results get inconsistent otherwise.
+    private static final double VIRTUAL_NODE_U_TURN_PENALTY = 100;


Can we try to make it consistent with the u-turn penalty used here or here?

I.e. using the same unit (seconds) and value. If both values should be similar, what happens if the heading penalty is changed?

The current as well as the previous transition metric use path lengths in m. We could later try to change this to path times as discussed in #86 but this should be done in a separate PR.

What we could do now is to convert HEADING_PENALTY in s into a heading penalty in m by assuming a default speed. @karussell, do you have a suggestion how to set the default speed?

I'm happy with it in meters - as @stefanholder says, it makes sense given the transition metric. Maybe we should allow the user to change it, however?

I'm now converting HEADING_PENALTY time into m using a conversion speed. This also allows the user to change the penalty by adjusting HEADING_PENALTY.
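The conversion described here is a single multiplication. A minimal sketch, assuming a fixed conversion speed; the constant's name and its value of 5 m/s are illustrative assumptions, not values confirmed in this thread:

```java
// Illustrative sketch; the conversion speed below is an assumed value.
final class HeadingPenaltySketch {

    // Assumed conversion speed in m/s (hypothetical; pick what suits your data).
    static final double PENALTY_CONVERSION_SPEED = 5.0;

    // Converts a heading penalty given in seconds into a U-turn penalty in meters.
    static double headingPenaltyInMeters(double headingPenaltySeconds) {
        return headingPenaltySeconds * PENALTY_CONVERSION_SPEED;
    }
}
```

Under these assumptions the default heading penalty of 300 mentioned earlier in this thread would correspond to 1500 m, and a user who lowers HEADING_PENALTY lowers the U-turn penalty proportionally.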

karussell · 2016-12-13T10:41:53Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

    private final Graph routingGraph;
    private final LocationIndexMatch locationIndex;
    private double measurementErrorSigma = 50.0;
-    private double transitionProbabilityBeta = 0.00959442;
+    private double transitionProbabilityBeta = 2.0;


Is this necessary in this PR or should we create a new one?

This parameter needed to be adjusted because the normalized transition metric was changed to the non-normalized transition metric. This was necessary because the normalized transition metric divides by the squared time difference, but for some traces such as #13, all timestamps are equal. The previous way to deal with this doesn't work with directed candidates and also has other problems as explained above.
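A small sketch of the two metrics shows why beta had to be retuned and why equal timestamps are a problem for the normalized variant. The class and method names are hypothetical; only the formula (division by the squared time difference) comes from the explanation above.

```java
// Illustrative sketch; method names are hypothetical.
final class TransitionMetricSketch {

    // Normalized metric: divides by the squared time difference in seconds.
    static double normalizedMetric(double routeLength, double linearDistance,
                                   double timeDiffSeconds) {
        return Math.abs(linearDistance - routeLength)
                / (timeDiffSeconds * timeDiffSeconds);
    }

    // Non-normalized metric: plain difference in meters, no timestamps needed.
    static double nonNormalizedMetric(double routeLength, double linearDistance) {
        return Math.abs(linearDistance - routeLength);
    }
}
```

For a 20 m difference and 10 s between fixes, the normalized metric is 0.2 while the non-normalized metric is 20, so a beta tuned for one scale does not carry over to the other. With a time difference of 0 the division yields Infinity (or NaN when both lengths are equal) in Java double arithmetic.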

karussell · 2016-12-13T10:43:02Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+            allQueryResults.addAll(qrs);
+        queryGraph.lookup(allQueryResults);
+
+        logger.info("================= Query results =================");


frequent logging should be avoided. Could we use a debug switch instead here?

Ok, I will use a static final variable that is checked before calling the logger.

Hmmh, but with many logging statements we make it more unreadable. Especially the core is just a library and should not log (or really only if necessary or informative and this then can be configured via log config). Logging is often (of course not always :) !!) a sign that certain parts are not easy to test - maybe we should fix this?

Actually logger.debug should be called instead of logger.info for the logging in MapMatching.java. Then this logging can be easily configured via log config and we don't need a switch anymore (unless we are worried about performance).

IMO, the added logging statements make sense because they help to figure out what is really going on during map matching. This is always useful when a trace is not map matched as expected. I don't think that more tests would replace this need for logging.

Ok. logger.debug sounds reasonable to me. Furthermore we should create a good debugging experience e.g. using something like the MiniGraphUI which requires also to move this project into the core to avoid circle deps.

Ok, I will change this to logger.debug. A graph UI would be great indeed.

Ding for #78 (which I've already somewhat implemented privately) - though assuming #792 is going to happen soon, we may want to wait.
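On the performance worry about debug logging raised above: the cost is mostly in building the message, which can be deferred. The sketch below uses java.util.logging as a stand-in for the SLF4J-style logger in MapMatching.java (Level.FINE roughly plays the role of debug); the class, logger name and message are hypothetical.

```java
import java.util.logging.Logger;

// Illustrative sketch; java.util.logging stands in for the project's logger.
final class DebugLoggingSketch {

    static final Logger LOGGER = Logger.getLogger("MapMatchingSketch");

    // Counts how often the (potentially expensive) message is actually built.
    static int describeCalls = 0;

    static String describeCandidates(int candidateCount) {
        describeCalls++;
        return "Query results: " + candidateCount + " candidates";
    }

    static void logCandidates(int candidateCount) {
        // The supplier is only evaluated if FINE is enabled for this logger.
        LOGGER.fine(() -> describeCandidates(candidateCount));
    }
}
```

With the level set to INFO the message is never built, so no separate static switch is needed for this purpose.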

karussell · 2016-12-13T10:44:55Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+    /**
+     * Find the possible locations of each qpxEntry in the graph.
+     */
+    private List<List<QueryResult>> findGPXEntriesInGraph(List<GPXEntry> gpxList,


It should be named like findQueryResults or lookupGPXEntries, findGPXEntries implies that we return something like List<GPXEntry> (?)

Ok, I can take care of this. This will result in a separate commit because @kodonnell added this method and I don't want to change his commit.

It should be named like findQueryResults or lookupGPXEntries, findGPXEntries implies that we return something like List<GPXEntry> (?)

That's true, lookupGPXEntries is probably better.

karussell · 2016-12-13T10:46:08Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+                                    queryGraph.getEdgeIteratorState(iter.getEdge(), iter.getAdjNode()));
+                        }
+                    }
+                    assert virtualEdges.size() == 2;


Can we throw an exception (IllegalStateException?) here with a detailed message?

I can throw an exception here but IllegalStateException should only be used for violated preconditions. Its javadoc says:

Signals that a method has been invoked at an illegal or inappropriate time.

However, this is an internal error. I would just use RuntimeException. Is this ok with you?

My mistake there - I didn't realise assertions are ignored without the -ea switch
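Since assertions are skipped unless the JVM runs with -ea, the check can be written as an explicit exception instead. A minimal sketch with a hypothetical helper name and message, using RuntimeException as proposed above:

```java
import java.util.List;

// Illustrative sketch; helper name and message text are hypothetical.
final class VirtualEdgeCheckSketch {

    // Fails even when assertions are disabled, and says what was found.
    static void requireTwoVirtualEdges(List<?> virtualEdges, int virtualNode) {
        if (virtualEdges.size() != 2) {
            throw new RuntimeException("Expected exactly 2 virtual edges at virtual node "
                    + virtualNode + " but found " + virtualEdges.size());
        }
    }
}
```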

karussell · 2016-12-13T10:53:20Z

matching-core/src/test/java/com/graphhopper/matching/util/HmmProbabilitiesTest.java

-    public void testTransitionLogProbability() {
-        HmmProbabilities instance = new HmmProbabilities();
-        // see #13 for the real world problem
-        assertEquals(0, instance.transitionLogProbability(1, 1, 0), 0.001);


Why did you remove this? Please see #13 and #75

Because I needed to remove the timeDiff check (see above), which made this unit test obsolete.

karussell · 2016-12-13T10:54:17Z

matching-web/src/main/java/com/graphhopper/matching/http/MatchResultToJson.java

-                wpt.put("x", extension.getQueryResult().getSnappedPoint().lon);
-                wpt.put("y", extension.getQueryResult().getSnappedPoint().lat);
-                wpt.put("timestamp", extension.getEntry().getTime());
+                wpt.put("x", extension.queryResult.getSnappedPoint().lon);


Should we enforce calling getQueryResult() and getEntry() instead?

@kodonnell, can you please answer this?

Doesn't worry me at all, and I don't know enough about Java to say which is good practice. So happy to leave to @karussell.

Sorry, this was actually my change. Same answer as here.

I'm usually a friend of public final stuff except where we guess that we could use subclasses and method overriding, but the problem here is that this breaks API compatibility for no (good) reason IMO.

Ok, I wasn't aware that GPXExtension is part of the public API. For me it was mainly the candidate representation but for the user it is the map matched GPX entry. I will then use getters for all fields in GPXExtension.

Ok, I wasn't aware that GPXExtension is part of the public API

Every public class is available from others and forms our public API. This will probably change in java 9

karussell · 2016-12-13T10:54:54Z

matching-web/src/main/java/com/graphhopper/matching/http/MatchServlet.java

@@ -64,6 +64,7 @@
    public void doPost(HttpServletRequest httpReq, HttpServletResponse httpRes)
            throws ServletException, IOException {

+    	logger.info("posted");


also here logs should be removed

@karussell, @kodonnell: Should we remove this or change to logger.debug (change was done by @kodonnell)?

Happy to get rid of it - I shouldn't have left it in there. (I couldn't figure out how to get any debug output from running the GUI, and that was me trying, I think.)

karussell · 2016-12-13T10:55:15Z

matching-core/src/main/java/com/graphhopper/matching/GPXExtension.java

-                + ", gpxListIndex:" + gpxListIndex;
-    }
-
-    public QueryResult getQueryResult() {


why is this removed?

Because queryResult is now public final. This achieves the same as this getter with less boilerplate code.

Good point - I prefer this in terms of code cleanliness/development (though I'm not sure if it's good/bad practice etc.).

karussell · 2016-12-13T10:56:16Z

matching-core/src/main/java/com/graphhopper/matching/GPXExtension.java

+    public final QueryResult queryResult;
+    public final boolean isDirected;
+    public final VirtualEdgeIteratorState incomingVirtualEdge;
+    public final VirtualEdgeIteratorState outgoingVirtualEdge;


I do not like having the virtual edges public, can we reduce visibility or use EdgeIteratorState instead?

Ok, I will change this to EdgeIteratorState.

kodonnell · 2016-12-13T20:21:51Z

First - nice work @stefanholder! Looks like it was pretty involved, so thanks for sticking with it. If you'd like, I'm happy to review, though it may be a few days. @karussell - can you let me know whether you'd prefer me to review this, or work on #87?

if you want to squash commits

I still don't know what squashing is, so you're safe from me!

karussell · 2016-12-13T21:00:02Z

@karussell - can you let me know whether you'd prefer me to review this, or work on #87?

I'd prioritize this one here but this should not be in my power - please do what you think is best or fun :) !

kodonnell

I'm still not sure this will work - but I'm happy to be led by proof, and this does seem to work for mm-uturns1. However, @stefanholder can you verify you also get the same results as below for mm-uturns2?

That's my only real concern - otherwise just a few implementation details as below. It's not noted there, but @stefanholder - is it possible to filter the issue 70 OSM data to only include relevant stuff (as per here)?

@karussell - it'd be nice to test with #73 = ) I can manually merge stuff to try, but it'd be easier if it was in the repo etc.

kodonnell · 2016-12-16T21:09:15Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+            if (penalizedEdges.contains(edge.getEdge())) {
+                totalPenalty += VIRTUAL_NODE_U_TURN_PENALTY;
+            }
+        }


Couple of points:

shouldn't we only be checking first/last path edge? I can't see why the middle edges would ever be penalized - and it'll be faster if we avoid checking them.

wondering if a HashSet is overkill (and slow). At least we could use e.g. trove IntHashSet (on edge ID). Or, if penalizedEdges is length = 2 (or 4?) and we're only comparing two edges, a loop is probably faster (and avoids allocating a new hash etc.). Maybe that's premature optimization though.

wondering if a HashSet is overkill (and slow). At least we could use e.g. trove IntHashSet (on edge ID)

HashSet is super fast for smaller sets (in Ks or even millions) and as we do not put all nodes in there it should be preferred, because an array would need to be the size of the graph, which can be in the range of hundreds of millions.

What we could slightly improve is doing new HashSet(penalizedVirtualEdges.size()) ... but with this change we do not need to copy it?

My point was more that we'll only ever have 2 (or 4?) entries in the set, and only calling 'contains' twice (i.e. first/last edge of path), so I'm not sure we need a hashset. (And, if we are using one, we might as well just hash the edge IDs, hence TIntHashSet. Or we could even use int[].) However, if it's premature optimization, I'm more than happy to ignore it.

An array with just two entries and then calling contains should be indeed faster. Sorry I misunderstood this.

but with this change we do not need to copy it?

Good point, I will just use the set returned by QueryGraph.getUnfavoredVirtualEdges. @kodonnell, anything else would indeed be premature optimization IMO.

shouldn't we only be checking first/last path edge?

Good point, I changed this.
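The outcome of this thread (penalize only the first and last path edge, looked up in the set of unfavored edges) could be sketched as follows. Edge IDs are plain integers here and all names are hypothetical; counting a single-edge path only once is an assumption of this sketch, not something decided above.

```java
import java.util.List;
import java.util.Set;

// Illustrative sketch; names and the single-edge handling are assumptions.
final class PenalizedPathSketch {

    // Adds the U-turn penalty if the first and/or last path edge is unfavored.
    // Middle edges are never checked.
    static double penalizedDistance(double pathDistance, List<Integer> pathEdgeIds,
                                    Set<Integer> unfavoredEdgeIds, double uTurnPenalty) {
        if (pathEdgeIds.isEmpty()) {
            return pathDistance;
        }
        double total = pathDistance;
        if (unfavoredEdgeIds.contains(pathEdgeIds.get(0))) {
            total += uTurnPenalty;
        }
        // Avoid counting the same edge twice when the path has a single edge.
        if (pathEdgeIds.size() > 1
                && unfavoredEdgeIds.contains(pathEdgeIds.get(pathEdgeIds.size() - 1))) {
            total += uTurnPenalty;
        }
        return total;
    }
}
```

This keeps the cost per path at two set lookups regardless of path length.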

kodonnell · 2016-12-16T21:25:07Z

matching-core/src/main/java/com/graphhopper/matching/MapMatching.java

+                        // get favored/unfavored edges:
+                        VirtualEdgeIteratorState incomingVirtualEdge = j == 0 ? e1 : e2;
+                        VirtualEdgeIteratorState outgoingVirtualEdge = j == 0 ? e2 : e1;
+                        // create candidate


We're ignoring the possibility of incomingVirtualEdge == outgoingVirtualEdge (i.e. a U-turn). @stefanholder - can you comment? My suspicion is it wouldn't change the results - but if that's the case, can we maybe make a note to explain to anyone else who notes this?

Do you mean it's possible that incomingVirtualEdge==outgoingVirtualEdge? I think it's not possible that the EdgeIterator returns the same edge twice and we are already asserting that we get exactly 2 virtual edges from the EdgeIterator. But if you want I can add another check here to make sure that incomingVirtualEdge!=outgoingVirtualEdge.

I mean that it's a perfectly valid route - I go in on the same edge I leave (i.e. U-turn). My gut feeling is we should include these as candidates (which will be penalized) ... I worry that specifically ignoring them might e.g. make U-turns impossible. However, it may just be an implementation thing, and we don't actually need to do it. So happy for you to make the call on this one.

If we created a candidate with incomingVirtualEdge==outgoingVirtualEdge then this candidate would actually allow a U-turn to be performed without a penalty by going to and from the virtual node through the other virtual edge pair.

Understood. If it's not discussed elsewhere, and you think it worthwhile, could you update the comment just to note that we don't need to consider these cases?

I updated the comment.
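For readers following the candidate generation, a minimal sketch of the two directed candidates per virtual node. Edges are represented by integer IDs and the class is hypothetical; it only mirrors the j == 0 ? e1 : e2 pattern from the diff above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; not the actual GPXExtension class.
final class DirectedCandidateSketch {
    final int virtualNode;
    final int incomingVirtualEdge;
    final int outgoingVirtualEdge;

    DirectedCandidateSketch(int virtualNode, int incomingVirtualEdge,
                            int outgoingVirtualEdge) {
        this.virtualNode = virtualNode;
        this.incomingVirtualEdge = incomingVirtualEdge;
        this.outgoingVirtualEdge = outgoingVirtualEdge;
    }

    // Creates one candidate per travel direction through the virtual node.
    // No candidate with incoming == outgoing is created: it would permit an
    // unpenalized U-turn through the other virtual edge pair.
    static List<DirectedCandidateSketch> createCandidates(int virtualNode, int e1, int e2) {
        if (e1 == e2) {
            throw new IllegalArgumentException("Virtual edges must be distinct");
        }
        List<DirectedCandidateSketch> candidates = new ArrayList<>();
        for (int j = 0; j < 2; j++) {
            int incoming = j == 0 ? e1 : e2;
            int outgoing = j == 0 ? e2 : e1;
            candidates.add(new DirectedCandidateSketch(virtualNode, incoming, outgoing));
        }
        return candidates;
    }
}
```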

stefanholder · 2016-12-19T06:29:17Z

However, @stefanholder can you verify you also get the same results as below for mm-uturns2?

I get the same results if I use issue-70.osm.gz as map data. The reason is that the road that the map matcher should match to is not contained in the map file. The trace matches correctly with the entire Serbia map.

@stefanholder - is it possible to filter the issue 70 OSM data to only include relevant stuff (as per here)?

The size of issue-70.osm.gz is 11 KB, so is this really a concern? If yes, could you do this as another commit to branch issue70 and maybe also include the map data for mm-uturns2? I will then squash this commit into your first commit in this PR.

stefanholder · 2016-12-19T08:48:35Z

I updated my commits with the changes discussed above and squashed some commits. If you want to see only what has changed you can compare with branch issue70-penalize-paths-v0.1. The comparison doesn't work in the Github UI because both branches diverged but works in IntelliJ, for example.

karussell

Nice, looks good to me.

I would like to get feedback from @michaz, do you (still) think this could/should be solved differently?

karussell · 2016-12-19T08:52:42Z

matching-core/src/main/java/com/graphhopper/matching/GPXExtension.java

-    public GPXEntry getEntry() {
-        return entry;
+    /**
+     * Returns true if the snapped point is a virtual node, otherwise returns false.


Is isDirected the best wording here? At least the javadocs should explicitly say 'true if the snapped point is a virtual node, otherwise it is a tower node and returns false'?

I think that isDirected is the best wording here because there are directed and undirected candidates (GPXExtensions). Hence the javadoc (implicitly) says that all virtual nodes are directed and all real nodes are undirected.

Sure, I can change the javadoc to what you suggested.

I'd also suggest renaming it to isVirtual if "virtual node" is the GraphHopper core terminology.
In addition, we should clarify/resolve this terminology in QueryGraph, where virtual edges are introduced. Currently, neither 'directed' nor 'undirected' appears there. It says that four new edges are inserted: "base-snap, snap-base, [...]", which suggests directed edges, whereas edges in the base graph are undirected. That's a rather useful thing to know, and we should be more explicit about it.

I see your point but from a map matching point of view, we either have directed or undirected candidates. Currently, all virtual nodes are transformed into directed candidates but we might later have directed candidates also for real nodes as discussed before. So the method name isDirected would be more stable than isVirtual. Also the javadoc of this method is meant to be read in combination with the class javadoc, which should explain the whole picture.

Updated the javadoc of isDirected.

karussell · 2016-12-19T08:53:30Z

matching-core/src/main/java/com/graphhopper/matching/GPXExtension.java

+    }
+
+    /**
+     * Returns null if this GPXExtension is not directed.


Should we instead throw an exception to enforce that isDirected is always called before?

Sure, this is probably safer to do.

I'm not super-particular about these kinds of things. There's not a lot which can go wrong here -- the NullPointerException you will get is pretty clear in its meaning.

Yes, but the NullPointerException could be thrown much later than the IllegalStateException (potentially after null is passed a couple of times), which can make debugging hard.
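A fail-fast getter along these lines could look like the sketch below. The class is a hypothetical stand-in for GPXExtension with an integer edge ID instead of an edge object.

```java
// Illustrative sketch; a hypothetical stand-in for GPXExtension.
final class DirectedGetterSketch {
    private final boolean directed;
    private final int incomingVirtualEdge; // only meaningful if directed

    private DirectedGetterSketch(boolean directed, int incomingVirtualEdge) {
        this.directed = directed;
        this.incomingVirtualEdge = incomingVirtualEdge;
    }

    static DirectedGetterSketch ofDirected(int incomingVirtualEdge) {
        return new DirectedGetterSketch(true, incomingVirtualEdge);
    }

    static DirectedGetterSketch ofUndirected() {
        return new DirectedGetterSketch(false, -1);
    }

    boolean isDirected() {
        return directed;
    }

    // Throws right at the call site instead of handing out a null-like value
    // that would only fail somewhere further down the line.
    int getIncomingVirtualEdge() {
        if (!directed) {
            throw new IllegalStateException(
                    "getIncomingVirtualEdge may only be called for directed candidates");
        }
        return incomingVirtualEdge;
    }
}
```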

michaz · 2016-12-19T11:02:22Z

I would like to get feedback from @michaz, do you (still) think this could/should be solved differently?

No no, I still like it. My concerns were more with putting other distance-like things (like time) instead of distance into the transition probabilities.

Just so I understand this correctly, and for future efforts: The reasons we are unfavoring the wrongly-directed virtual edges, rather than just not inserting them, are that

1a.) QueryGraph is the way it is.
2a.) We want to allow inner-link U-turns, just with low probability.

Similarly, the special case for "matching to a real node" is

1b.) because QueryGraph is the way it is.
2b.) because in the cases this happens, it may give better performance

3.) I think there may be some interference between unrelated things. We use one QueryGraph for everything, so every candidate introduces inner-link U-turn possibilities on routes between completely different candidates.

So, if we could tell QueryGraph to always create a pair of directed edge-matches (2 * (virtual node + 2 virtual edges)), it would fix 3.) and be more straight-forward, but at the cost of missing 2a) and 2b)?

karussell · 2016-12-19T11:54:02Z

2a.) We want to allow inner-link U-turns, just with low probability.

Yes, I think it is this case

Similarly, the special case for "matching to a real node" is

Here kind of both apply. We can also improve QueryGraph, but avoiding the creation of virtual nodes is good for performance, especially for multiple hundreds of points. But we probably should test it before using it as an argument :)

We use one QueryGraph for everything, so every candidate introduces inner-link U-turn possibilities on routes between completely different candidates.

Here @stefanholder uses the 'unfavouring' stuff to clear and avoid interfering, I think. Still, making QueryGraph stateful removes the possibility to do matching on multiple threads (with the same QueryGraph).

So, if we could tell QueryGraph to always create a pair of directed edge-matches

You mean always creating virtual nodes&edges?

stefanholder · 2016-12-19T12:02:29Z

The reasons we are unfavoring the wrongly-directed virtual edges, rather than just not inserting them, are that
1a.) QueryGraph is the way it is.
2a.) We want to allow inner-link U-turns, just with low probability.

The main reason is that during candidate generation we don't know yet what the wrongly-directed virtual edges are. This is determined by the Viterbi algorithm after the router computed the penalized path lengths. Moreover, both directed candidates can be valid choices even without any U-turn if the road is not a dead end.

Similarly, the special case for "matching to a real node" is
1b.) because QueryGraph is the way it is.
2b.) because in the cases this happens, it may give better performance

I would say because we first wanted to address inner-link U-turns in this PR and not U-turns at real intersections. However, this shouldn't be hard to do in another PR if really needed.

3.) I think there may be some interference between unrelated things. We use one QueryGraph for everything, so every candidate introduces inner-link U-turn possibilities on routes between completely different candidates.

Additional virtual nodes in the QueryGraph from other candidates shouldn't be a problem because

Since virtual nodes retrieved by findNClosest are perpendicular to GPS positions, we don't have additional candidates for any GPS position
inner-link U-turns can only happen at the beginning or end of a path but not at virtual nodes from other candidates (otherwise this path wouldn't be the fastest or shortest including penalties from unfavored edges)

michaz · 2016-12-19T12:28:53Z

You mean always creating virtual nodes&edges?

I mean instead of the QueryGraph doing this

and then doing the direction-thing in the map matching code...

...maybe have the QueryGraph do this

and then do nothing more in the map matching code.

And yes, treat all the matches like this. No special case for being EPSILON next to a real node.

michaz · 2016-12-19T12:32:59Z

Later, I mean. Not now.

stefanholder · 2016-12-19T13:23:18Z

...maybe have the QueryGraph do this

Interesting idea, this would make virtual edges truly unidirectional. From my understanding of Graphhopper internals, this would mean splitting the flags field into two flags fields - one for each direction.

stefanholder · 2016-12-19T13:31:34Z

But as you wrote before, this would completely prevent inner-link U-turns instead of penalizing these. Since penalizing U-turns is more powerful than preventing U-turns, I think we should stick with penalties for inner-link U-turns.

karussell · 2016-12-19T14:18:19Z

Interesting. I also favour penalizing u-turns over completely avoiding them. Also we would introduce a bit of complexity due to two candidates '7a' and '7b' instead of one?

We could improve clarity in a later PR via fixing QueryGraph to use just two bidirectional edges (instead of 4 unidir. edges) and use the edge based traversal mode, then we could use the TurnWeighting to avoid u-turns without a special virtual edge handling in the map matching core I guess. The costs would be that the full algorithm will be roughly 2 times slower but for motor vehicles we have to consider turn costs anyway.

michaz · 2016-12-19T14:31:19Z

All right, I'm just saying: The whole QueryGraph-virtual-node-virtual-edge-business was just what I found to be there when I did the prototype and noticed that GraphHopper generally "thinks undirectedly" (which I don't, I always think of road segments as two opposing directed edges, and storing them as one data record is just a compressed representation of that, which should ideally be abstracted away in the layer directly above data storage, wherever possible).

I did not intend it to model U-turns, this is just what happened.

If I had found a way in QueryGraph to get positions on directed-edge-like-things, I would have immediately used that.

stefanholder · 2016-12-19T14:34:09Z

then we could use the TurnWeighting to avoid u-turns without a special virtual edge handling in the map matching core I guess

Also with TurnWeighting we need two candidates per virtual node, one for each direction. This is because we don't know the correct direction at candidate generation (see above). Hence, I'm not sure what the benefit of TurnWeighting would be. Still it would be nice to support all traversal modes for map matching.

I did not intend it to model U-turns, this is just what happened.

Sure, that's what I thought.

karussell · 2016-12-19T14:56:03Z

but not for the correct sequence of real nodes / real edges, which is suboptimal when there are no street names such as here. Is there a reason for this?

I'm not 100% sure about the reason besides being simpler to debug. But it could be the following: although nodes&edges are stable if we do not change the source map, they could still change if something changes during import, and checking just streets is more robust. The best solution, which I would also like to include in the public web API, would be to use OSM nodes, which are okay to debug and relatively stable (OSM way IDs are not that stable).

…hopper#70) For GPX traces with equal timestamps, all transitions had a probability of 1 and hence transitions were not considered during map matching. With directed candidates the situation got even worse, because the Viterbi algorithm could choose a candidate with the wrong direction: penalties from unfavored edges would still result in a transition probability of 1. In this case the resulting map matching path would take unnecessary detours.
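
As a rough illustration of the problem this commit message describes, the sketch below contrasts a time-normalized transition metric that short-circuits on `timeDiff == 0` with a non-normalized one. The class name, method names, the exponential form, the `BETA` value and the penalty of 300 are all assumptions made for illustration; this is not the code of this PR.

```java
final class TransitionMetricSketch {
    // Hypothetical scale parameter of the exponential distribution (not taken from the PR).
    static final double BETA = 2.0;

    /** Log density of an exponential distribution with mean beta at x >= 0. */
    static double logExponential(double beta, double x) {
        return Math.log(1.0 / beta) - x / beta;
    }

    /** Normalized variant: with equal timestamps every transition gets log-probability 0. */
    static double normalizedLogProb(double routeLength, double linearDistance, double timeDiff) {
        if (timeDiff == 0.0) {
            return 0.0; // probability 1, no matter how long the (penalized) route is
        }
        double metric = Math.abs(linearDistance - routeLength) / (timeDiff * timeDiff);
        return logExponential(BETA, metric);
    }

    /** Non-normalized variant: a longer (penalized) route always scores lower. */
    static double nonNormalizedLogProb(double routeLength, double linearDistance) {
        return logExponential(BETA, Math.abs(linearDistance - routeLength));
    }

    public static void main(String[] args) {
        double linear = 100.0;            // straight-line distance between two GPX points
        double direct = 105.0;            // route length without unfavored edges
        double penalized = direct + 300;  // same route plus a hypothetical penalty
        // Equal timestamps: the normalized variant cannot tell the two routes apart.
        System.out.println(normalizedLogProb(direct, linear, 0.0)
                == normalizedLogProb(penalized, linear, 0.0));
        // The non-normalized variant prefers the route without the penalty.
        System.out.println(nonNormalizedLogProb(direct, linear)
                > nonNormalizedLogProb(penalized, linear));
    }
}
```

Under these assumptions, the penalty only influences the Viterbi choice when the metric does not collapse to a constant for `timeDiff == 0`.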

stefanholder · 2016-12-20T12:44:49Z

I updated my commits with the changes discussed above. The previous commits can be found in branch issue70-penalize-paths-v0.2.

karussell

Looks good to me. The car example here (also the official example of the Dir API) is also fixed with this: there were 2 obstacles before and now it is a perfect match, nice! (I used this extract for the pbf.)

The problem now is how we measure quality (but @kodonnell already opened an issue for this). A smaller difference is an indicator, but in this specific example the distance difference is now bigger!

karussell · 2016-12-20T17:06:13Z

Please give me a +1 and I'll merge, @michaz & @kodonnell.

stefanholder · 2016-12-23T07:55:11Z

@karussell, your last comment got thumbs up from both @michaz and @kodonnell (there is no email notification for this). Not sure if this is what you meant with +1.

karussell · 2016-12-23T09:00:55Z

Thanks!

there is no email notification for this

How ugly. They could at least give a notification in the web UI ... this is what discourse does (in general they have a much better notification system ...)

karussell · 2016-12-23T09:01:19Z

Merged - thanks again all involved :) !

stefanholder · 2016-12-23T10:44:25Z

How ugly.

Yes, it's probably better that all reviewers approve via GitHub pull request reviews.

kodonnell · 2016-12-24T00:28:17Z

As an aside, as #73 was merged before this (thanks @karussell), we've got the following two measurement results. (We can't easily go back further.)

commits:
--------------------------

3cd6476 [2016-12-20] Performance measurement suite (#73)
d3cf21e [2016-12-23] Merge pull request #88 from stefanholder/issue70-penalize-paths

measurements:
-------------

                              3cd6476             d3cf21e             
                              -------             -------             
location_index_match.max      10.756949           5.571446            
location_index_match.mean     0.3402709212        0.19900762400000002 
location_index_match.min      0.001734            0.001277            
location_index_match.sum      1701.354606         995.03812           
map_match.max                 1645.143509         2409.088762         
map_match.mean                127.85747168        274.09888926        
map_match.min                 1.564703            3.390409            
map_match.sum                 12785.747168        27409.888926        
measurement.count             5000                5000                
measurement.seed              123                 123                 
measurement.time              19071               36710               
measurement.totalMB           1529                1524                
measurement.usedMB            50                  37

So, roughly speaking, the locationIndex is a little under twice as fast, but the map-matching is a little over twice as slow.

karussell · 2016-12-24T10:51:57Z

This looks strange. Did you run it twice and were the results consistent? Maybe there are some problems in the measurement suite. E.g. why should the location index be faster?

kodonnell · 2016-12-24T22:11:04Z

Did you run it twice and were the results consistent?

No - I'd assumed all the warm-up and multiple testing would make it OK - as below, I was wrong = ) Not sure if it's a bug ... though maybe there could have been varying demands on the system (just my laptop) or some such. Anyway, I've run it multiple times, and consistently got results similar to below:

commits:
--------------------------

3cd6476 [2016-12-20] Performance measurement suite (#73)
d3cf21e [2016-12-23] Merge pull request #88 from stefanholder/issue70-penalize-paths

measurements:
-------------

                              3cd6476             d3cf21e             
                              -------             -------             
location_index_match.max      5.239315            5.63816             
location_index_match.mean     0.1950378346        0.19318890019999999 
location_index_match.min      0.001261            0.001284            
location_index_match.sum      975.189173          965.944501          
map_match.max                 1619.534054         2392.89043          
map_match.mean                127.55285619        271.12588989        
map_match.min                 1.563381            3.044777            
map_match.sum                 12755.285619        27112.588989        
measurement.count             5000                5000                
measurement.seed              123                 123                 
measurement.time              18018               35912               
measurement.totalMB           1526                1522                
measurement.usedMB            37                  37

That said, about half the time the used MB is like above, and the other half, the latest commit is 50% higher.

So ... locationIndex is the same (as expected) though map matching is slower (which is expected, given we're increasing the number of candidates). For now, I'm guessing we're not too worried by the increased runtime (as it comes with a new feature)?

karussell · 2016-12-28T11:46:21Z

No - I'd assumed all the warm-up and multiple testing would make it OK

It should. I expect the measurement suite is suboptimal somewhere.

E.g. the while loop is something that I do not like much. With the new algorithm we should make every matching working I think and throw an error for problems.

For now, I'm guessing we're not too worried by the increased runtime (as it comes with a new feature)?

It depends a bit on whether we understand the underlying issue. We should try as hard as possible to avoid slowdowns. Otherwise we end up with powerful but slow software at some point.

which is expected, given we're increasing the number of candidates

Are we increasing them?

kodonnell · 2016-12-30T18:58:08Z

Are we increasing them?

My understanding is that every virtual node (a single candidate before) becomes (two) directed candidates (virtual node + direction). Assuming every node is a virtual one, then that's twice as many candidates, and four times as many transitions (which is likely to be the slow part, as it's the routing). The Viterbi processing will also be slowed down due to the increased number of candidates (and may require more memory to store everything in the maps). At this stage, I'm assuming we're only seeing a 2.5x slowdown (as opposed to 4x or more) because not every candidate is virtual.
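
The counting argument above can be written down as a small calculation. This is only a back-of-the-envelope model: it assumes a fixed number of candidates per GPX point and that a fraction `virtualFraction` of them are virtual nodes. The class and method names are made up for illustration, and the numbers are not measurements.

```java
final class CandidateGrowthSketch {
    /** Candidates per time step once every virtual node is split into two directed candidates. */
    static double directedCandidates(double candidatesPerStep, double virtualFraction) {
        return candidatesPerStep * (1.0 + virtualFraction);
    }

    /** Factor by which the number of transitions (routing calls) between two steps grows. */
    static double transitionGrowth(double virtualFraction) {
        double candidateGrowth = 1.0 + virtualFraction;
        // Every candidate of one step is connected to every candidate of the next step.
        return candidateGrowth * candidateGrowth;
    }

    public static void main(String[] args) {
        for (double f : new double[] {0.0, 0.5, 1.0}) {
            System.out.printf("virtual fraction %.1f: candidates x%.2f, transitions x%.2f%n",
                    f, 1.0 + f, transitionGrowth(f));
        }
    }
}
```

With every candidate virtual, the model reproduces the factors of two (candidates) and four (transitions); with only part of the candidates virtual, the transition factor lies between one and four.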

the while loop is something that I do not like much. With the new algorithm we should make every matching working I think and throw an error for problems.

Do you mean this one? Agreed, with #87 we should (?) hopefully never fail - at worst, we get back a list of single-point sequences. That said, another option is to tweak MiniPerfTest to (optionally) catch exceptions and count/report them in the final stats, and then we could drop the while loop. I'd only do that if it was useful in other situations.
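
A minimal sketch of the "catch and count" idea, assuming a standalone harness: `FailureCountingPerfSketch` and its methods are hypothetical and do not reflect the actual `MiniPerfTest` API. Failed iterations are counted and reported instead of being retried in a loop.

```java
import java.util.function.IntConsumer;

final class FailureCountingPerfSketch {
    private int successes;
    private int failures;
    private long totalNanos;

    /** Runs the task once per iteration index; exceptions are counted instead of aborting. */
    FailureCountingPerfSketch run(int count, IntConsumer task) {
        for (int i = 0; i < count; i++) {
            long start = System.nanoTime();
            try {
                task.accept(i);
                totalNanos += System.nanoTime() - start; // only successful runs are timed
                successes++;
            } catch (RuntimeException ex) {
                failures++; // reported in the final stats, no retry
            }
        }
        return this;
    }

    int getSuccesses() { return successes; }

    int getFailures() { return failures; }

    /** Mean duration of the successful runs in milliseconds, NaN if none succeeded. */
    double getMeanMillis() {
        return successes == 0 ? Double.NaN : totalNanos / 1e6 / successes;
    }

    public static void main(String[] args) {
        // Stand-in for a map matching call that fails for some samples.
        FailureCountingPerfSketch stats = new FailureCountingPerfSketch().run(10, i -> {
            if (i % 5 == 0) {
                throw new IllegalStateException("no match for sample " + i);
            }
        });
        System.out.println("ok=" + stats.getSuccesses() + " failed=" + stats.getFailures()
                + " meanMillis=" + stats.getMeanMillis());
    }
}
```

The failure count could then double as a rough quality indicator alongside the timing statistics.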

karussell · 2016-12-31T09:35:16Z

Assuming every node is a virtual one, then that's twice as many candidates, and four times as many transitions

Oops, indeed. BTW: We should put in the one-to-many "cache" to make this a lot faster (I hope). Was this meant with #81 ?

Do you mean this one?

yes

Agreed, with #87 we should (?) hopefully never fail

Then I'd prefer the procedure you describe (throwing vs. counting errors), and then a (slightly modified) measurement suite could also act as a quality indicator like in #89.

kodonnell · 2016-12-31T17:46:21Z

Was this meant with #81 ?

I'm not sure what you mean, sorry.

Then I'd prefer

OK, I'll try to remember to tweak it as part of #87.

karussell · 2016-12-31T20:14:46Z

OK, I'll try to remember to tweak it as part of #87

Better in a separate PR

stefanholder · 2017-01-12T14:11:24Z

We should use a one-to-many Dijkstra algorithm, which only aborts after all target nodes have been found. I think this was meant with #82. @karussell: Is this also what you meant with one-to-many "cache" above?
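
For readers unfamiliar with the idea, here is a minimal sketch of a one-to-many Dijkstra search that stops only once all targets are settled. It runs on a plain adjacency list rather than on GraphHopper's graph classes; the class `OneToManyDijkstraSketch` and its methods are illustrative assumptions and unrelated to GraphHopper's own `DijkstraOneToMany`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

final class OneToManyDijkstraSketch {
    /** Builds an adjacency list from directed {from, to, weight} triples. */
    static List<List<double[]>> graph(int nodes, double[][] edges) {
        List<List<double[]>> adjacency = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            adjacency.add(new ArrayList<>());
        }
        for (double[] e : edges) {
            adjacency.get((int) e[0]).add(new double[] {e[1], e[2]});
        }
        return adjacency;
    }

    /** One search from source; returns the distance for every reachable target. */
    static Map<Integer, Double> shortestDistances(List<List<double[]>> adjacency,
                                                  int source, Set<Integer> targets) {
        double[] dist = new double[adjacency.size()];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[source] = 0.0;
        Set<Integer> remaining = new HashSet<>(targets);
        Map<Integer, Double> found = new HashMap<>();
        // Queue entries are {node, distance}; outdated entries are skipped below.
        PriorityQueue<double[]> queue =
                new PriorityQueue<>((a, b) -> Double.compare(a[1], b[1]));
        queue.add(new double[] {source, 0.0});
        // Abort only when every target has been settled (or the graph is exhausted).
        while (!queue.isEmpty() && !remaining.isEmpty()) {
            double[] entry = queue.poll();
            int node = (int) entry[0];
            if (entry[1] > dist[node]) {
                continue; // a shorter path to this node was already processed
            }
            if (remaining.remove(node)) {
                found.put(node, entry[1]);
            }
            for (double[] edge : adjacency.get(node)) {
                int next = (int) edge[0];
                double candidate = entry[1] + edge[1];
                if (candidate < dist[next]) {
                    dist[next] = candidate;
                    queue.add(new double[] {next, candidate});
                }
            }
        }
        return found; // unreachable targets are simply absent
    }

    public static void main(String[] args) {
        List<List<double[]>> g = graph(5, new double[][] {
                {0, 1, 1}, {0, 2, 4}, {1, 2, 2}, {2, 3, 1}, {3, 4, 5}});
        System.out.println(shortestDistances(g, 0, new HashSet<>(Arrays.asList(2, 3))));
    }
}
```

In a map matching setting, the source would be one candidate of a time step and the targets all candidates of the next step, so a single search replaces one routing call per candidate pair.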

karussell · 2017-01-12T14:15:41Z

@karussell: Is this also what you meant with one-to-many "cache" above?

I meant exactly this, yes. But I'm not sure of intentions of the issue creator :)

kodonnell · 2017-01-12T19:27:45Z

But I'm not sure of intentions of the issue creator :)

The intent of #82 was to use DijkstraOneToMany.

WIP: playing round with directed candidates for graphhopper#70

f0afa5f

stefanholder changed the title ~~Issue70 penalize paths~~ Penalize inner-link U-turns (builds on #83) Dec 13, 2016

stefanholder mentioned this pull request Dec 13, 2016

prevent inner-link U-turns #83

Closed

karussell reviewed Dec 13, 2016

kodonnell mentioned this pull request Dec 13, 2016

compute candidates on the fly if possible? #90

Open

kodonnell mentioned this pull request Dec 13, 2016

suboptimal (?) transition probability metric #86

Open

kodonnell reviewed Dec 16, 2016

Finish penalizing inner-link U-turns (graphhopper#70)

e2fe456

stefanholder force-pushed the issue70-penalize-paths branch from e4ad3a2 to 1ea7565 Compare December 19, 2016 08:36

karussell suggested changes Dec 19, 2016

stefanholder self-assigned this Dec 19, 2016

stefanholder added 3 commits December 20, 2016 07:44

Improve javadoc, refactor and clean up (graphhopper#70)

c0d3323

Fix unit tests (graphhopper#70)

58417cd

stefanholder force-pushed the issue70-penalize-paths branch from 1ea7565 to 58417cd Compare December 20, 2016 12:39

karussell approved these changes Dec 20, 2016

karussell merged commit d3cf21e into graphhopper:master Dec 23, 2016

karussell added this to the 0.9 milestone Dec 23, 2016

karussell added the improvement label Dec 23, 2016

karussell mentioned this pull request Dec 28, 2016

Avoid unnecessary u-turns #70

Closed

stefanholder mentioned this pull request Jan 12, 2017

Improve javadoc comments graphhopper/graphhopper#932

Merged

stefanholder mentioned this pull request Jun 14, 2017

Investigate performance slow down #97

Closed
