
## Arbitrary Polygon to Triangle in O(n log n)

This approach handles convex polygons like rectangles, pentagons, hexagons, and so on. Here you just take a starting vertex and connect it to all the other vertices, producing a triangle fan. This algorithm is trivial, and what I really wanted was a more general solution that could handle concave polygons and polygons with holes.
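That fan construction is almost a one-liner; a minimal sketch in Python, assuming the vertices are given in order and the polygon really is convex:

```python
def fan_triangulate(polygon):
    """Split a convex polygon into triangles by fanning out from the
    first vertex. polygon is a list of (x, y) vertices in order;
    an n-vertex polygon yields n - 2 triangles."""
    return [(polygon[0], polygon[i], polygon[i + 1])
            for i in range(1, len(polygon) - 1)]

# A square fans into two triangles that share the first vertex.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

For concave polygons or polygons with holes this fan produces triangles outside the shape, which is exactly why the more general algorithms below are needed.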

While there are algorithms that run faster, the faster algorithms become very complicated. Kirkpatrick et al. found an algorithm that runs in O(n log log n), and Chazelle did it in O(n). However, the easiest to implement is probably Seidel's algorithm, which runs in O(n log n).

The algorithm is a three-step process:

- Decompose the polygon into trapezoids

- Decompose the trapezoids into monotone polygons

- Decompose the monotone polygons into triangles

If you're interested in C source, it can be obtained from the University of North Carolina at Chapel Hill. In general the code quality is good and it handles holes, but it will probably need to be massaged to your needs.

Please note that the Seidel algorithm code's license forbids commercial use.

## algorithm - Simple 2d polygon triangulation - Stack Overflow

It is shown in this paper that the minimum distance between two finite planar sets if [sic] n points can be computed in O(n log n) worst-case running time and that this is optimal to within a constant factor. Furthermore, when the sets form a convex polygon this complexity can be reduced to O(n).

Let P = {p1, p2,..., pm} and Q = {q1, q2,..., qn} be two intersecting polygons whose vertices are specified by their Cartesian coordinates in order. An optimal O(m + n) algorithm is presented for computing the minimum Euclidean distance between a vertex pi in P and a vertex qj in Q.

## geometry - What is the fastest algorithm to calculate the minimum dist...

Put your polygons into some kind of spatial lookup structure, for example an R-tree based on the minimum bounding rectangles for the polygons. Then you should be able to compute the containment relation that you're after in O(n log n). (Unless you're in a pathological case where many of the minimum bounding rectangles for your polygons overlap, which seems unlikely based on your description.)

Edited to add: of course, you don't rely on the R-tree to tell you whether one polygon is inside another (it's based on minimum bounding rectangles, so it can only give you an approximation). You use the R-tree to cheaply identify candidate inclusions, which you then verify the expensive way (checking that a point of one polygon is inside the other).
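The two-phase idea can be illustrated with a rough sketch; note this uses a brute-force scan over bounding boxes in place of a real R-tree, so the filtering phase alone here is O(n^2), not the O(n log n) an actual spatial index gives you:

```python
def bbox(poly):
    """Axis-aligned minimum bounding rectangle of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)

def bbox_contains(outer, inner):
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1] and
            inner[2] <= outer[2] and inner[3] <= outer[3])

def candidate_inclusions(polygons):
    """Cheap first phase: index pairs (i, j) where polygon j *might*
    be inside polygon i, judged by bounding rectangles only. Each
    pair still needs the expensive point-in-polygon verification."""
    boxes = [bbox(p) for p in polygons]
    return [(i, j)
            for i, bi in enumerate(boxes)
            for j, bj in enumerate(boxes)
            if i != j and bbox_contains(bi, bj)]
```

A large square, a small square inside it, and a distant square produce exactly one candidate pair, which the exact test would then confirm or reject.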

This is a very simplistic approach that doesn't need pathological cases to fail. Suppose there is a large polygon shaped like a "C"; if we put a square in the empty area of the "C", why would you say that this square is contained by the other polygon? Or maybe this is the pathological case and I'm adding unneeded requirements to a question that isn't mine.

Thanks for the answer, but I don't think it will work. The problem is the time needed to build the lookup structure, as it won't be reused (i.e. the polygons may be different every time). Also, bboxes may or may not overlap, that's 50% and that's a lot :).

It costs O(n log n) to build the R-tree, so I don't think that's a concern.

@EcirHana this will work; building a spatial index can be extremely fast. I recommend using a PMR bucket rectangle quadtree as the spatial index.

## algorithm - Polygon inside polygon inside polygon - Stack Overflow

An O(n log X) algorithm, where X depends on the precision you want.

Binary search for the largest radius R of a circle:

At each iteration, for a given radius r, push each edge E "inward" by r to get E'. For each edge E', define a half-plane H as the set of all points "inside" the polygon (using E' as the boundary). Now compute the intersection of all these half-planes H, which can be done in O(n) time. If the intersection is non-empty, then a circle of radius r drawn around any point of the intersection as its center will lie inside the given polygon.
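A sketch of the idea in Python, under two simplifying assumptions: the polygon is convex with counter-clockwise vertices, and the half-plane intersection is done by incremental clipping (simpler to write, but O(n^2) worst case rather than the O(n) the answer alludes to):

```python
import math

def clip_halfplane(region, a, b, r):
    """Keep the part of `region` at signed distance >= r to the left
    of the directed line a->b, i.e. inside the edge pushed inward by r."""
    ex, ey = b[0] - a[0], b[1] - a[1]
    length = math.hypot(ex, ey)
    def sdist(p):
        return (ex * (p[1] - a[1]) - ey * (p[0] - a[0])) / length
    out = []
    for i in range(len(region)):
        p, q = region[i], region[(i + 1) % len(region)]
        dp, dq = sdist(p) - r, sdist(q) - r
        if dp >= 0:
            out.append(p)
        if (dp < 0) != (dq < 0):  # edge crosses the shifted boundary
            t = dp / (dp - dq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def radius_fits(poly, r):
    """Is the intersection of all inward-shifted half-planes non-empty?"""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    region = [(min(xs), min(ys)), (max(xs), min(ys)),
              (max(xs), max(ys)), (min(xs), max(ys))]
    for i in range(len(poly)):
        region = clip_halfplane(region, poly[i], poly[(i + 1) % len(poly)], r)
        if not region:
            return False
    return True

def largest_inscribed_radius(poly, tol=1e-7):
    """Binary-search the radius; any surviving region point is a center."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    lo, hi = 0.0, max(max(xs) - min(xs), max(ys) - min(ys))
    while hi - lo > tol:
        r = (lo + hi) / 2
        if radius_fits(poly, r):
            lo = r
        else:
            hi = r
    return lo
```

For the unit square this converges to radius 0.5, the inscribed circle centered at (0.5, 0.5).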

## algorithm - Largest circle inside a non-convex polygon - Stack Overflo...

If you want to get non-overlapping polygons, you can run a line intersection algorithm. A simple variant is the Bentley-Ottmann algorithm, but even faster algorithms running in O(n log n + k) (with n vertices and k crossings) exist.

Given a line intersection, you can unify two polygons by constructing a vertex connecting the two polygons at the intersection point. Then you follow the vertices of one polygon inside the other polygon (you can determine which direction to go using your point-in-polygon function) and remove all vertices and edges until you reach the outside of the polygon. There you repair the polygon by creating a new vertex at the second intersection of the two polygons.

Unless I'm mistaken, this can run in O(n log n + k * p) time where p is the maximum overlap of the polygons.

After unification of the polygons you can use an ordinary area function to calculate the exact area of the polygons.
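The "ordinary area function" here is typically the shoelace formula; a minimal version:

```python
def polygon_area(poly):
    """Shoelace formula: area of a simple (non-self-intersecting)
    polygon given as a list of (x, y) vertices in order."""
    s = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0
```

The signed sum is positive for counter-clockwise vertex order and negative for clockwise; taking the absolute value makes the function orientation-independent.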

## objective c - Calculating total coverage area of a union of polygons -...

I'm a fan of sorting the points by the angle that the segments make on either side of each point, then iteratively removing the points with the shallowest angle until you hit some threshold. O(n log n) I think, versus the RDP method's O(n^2), and with smaller constants to boot. :)

The angle can even be scaled by the segment lengths (indeed it's easier to compute it this way) if you want to give more weight (desirability) to longer segments.

Given p0, p1, p2, p1's weight would be ((p0 - p1) dot (p2 - p1)); normalize the difference vectors if you don't want to weight by length. (Contrast this with distance-to-line: this is much cheaper, and the results may be identical.)

Actually, this could even be an O(n) solution if you just march front to back and remove points that don't meet your angle criterion; just look one or two points ahead of where you might be removing points to make sure that better-to-remove points aren't coming up.
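That weight can be sketched directly; note the sign convention: when p0, p1, p2 are nearly collinear the two edge vectors point in nearly opposite directions, so the dot product is strongly negative, and the most negative weights mark the flattest (most removable) vertices:

```python
def removal_weight(p0, p1, p2):
    """((p0 - p1) dot (p2 - p1)): length-scaled angle weight at p1.
    Normalize the two difference vectors first if you don't want
    longer edges to dominate the weight."""
    ax, ay = p0[0] - p1[0], p0[1] - p1[1]
    bx, by = p2[0] - p1[0], p2[1] - p1[1]
    return ax * bx + ay * by
```

A collinear middle point therefore scores lower (more removable) than a sharp corner, which is exactly the ordering the iterative removal needs.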

I see a weak spot in this algorithm: if you have a rectangle with a very small but bushy "star" at one of the corners (that has sharp corners), this algorithm will wipe out the rectangle first.

You may need to tweak your weighting, e.g. so that a vertex with two "long" edges would be seen as less desirable to remove. The single-pass take on this algorithm would be less useful in cases like this. However, my last paragraph in the original post should nudge in a decent direction.

## algorithm - Reduce number of points in line - Stack Overflow

Convexity is fairly easy to test for: consider each set of three consecutive points along the polygon. If every angle is 180 degrees or less, you have a convex polygon. This test runs in O(n) time.

Note, also, that in most cases this calculation is something you can do once and save; most of the time you have a set of polygons to work with that don't go changing all the time.

Edit: Peter Kirkham pointed out a polygon that this test won't catch. It's easy enough to catch his polygon, also: When you figure out each angle, also keep a running total of (180 - angle). For a convex polygon this will total 360.
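A sketch combining both checks, with cross products standing in for explicit angle comparisons; the turn-sum test is the running total described in the edit, which rejects self-intersecting star-like traversals whose exterior angles sum to more than one full turn:

```python
import math

def is_convex(poly):
    """poly: list of (x, y) vertices in order. Convex iff every turn
    has the same sign AND the exterior angles sum to one full turn
    (360 degrees, i.e. 2*pi radians)."""
    n = len(poly)
    sign = 0
    turn_sum = 0.0
    for i in range(n):
        ax, ay = poly[i]
        bx, by = poly[(i + 1) % n]
        cx, cy = poly[(i + 2) % n]
        e1x, e1y = bx - ax, by - ay          # edge into the vertex
        e2x, e2y = cx - bx, cy - by          # edge out of the vertex
        cross = e1x * e2y - e1y * e2x
        if cross != 0:
            s = 1 if cross > 0 else -1
            if sign == 0:
                sign = s
            elif s != sign:
                return False                 # mixed turns: reflex vertex
        # exterior angle at this vertex, signed
        turn_sum += math.atan2(cross, e1x * e2x + e1y * e2y)
    return abs(abs(turn_sum) - 2 * math.pi) < 1e-6
```

A square passes; an L-shaped polygon fails the sign test at its reflex corner.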

That sounds like a good approach to the problem, but it still leaves the problem of complex polygons.

You can successfully use this algorithm for every case. Polygons whose sides intersect will fail this angle test.

I can't try this solution till I get back to work, but it looks promising and it should work.

For a convex polygon the total of the interior angles is (n-2)*180 (regentsprep.org/Regents/math/poly/LPoly1.htm), which is another way of looking at it.

## algorithm - How do determine if a polygon is complex/convex/nonconvex?...

If you have a representation of the US as a polygon you could then use a 'point-in-polygon' algorithm, such as a crossing number test, to test whether the point lies within the polygon or not. This type of query runs in O(n) time, for a polygon with n edges.

If you want something faster, but approximate, you could do an (offline) spatial decomposition of your polygon, via something like a quadtree and determine which leaf boxes in the tree lie within the borders. The average (online) runtime to find the enclosing leaf box for a point would then be O(log(n)) for a tree with n boxes.
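The crossing number test mentioned above is short enough to sketch in full: cast a ray from the point (here, rightward) and count how many polygon edges it crosses; an odd count means the point is inside:

```python
def point_in_polygon(pt, poly):
    """Crossing number / even-odd test. poly is a list of (x, y)
    vertices in order. Points exactly on the boundary may be
    classified either way, which is the usual caveat of this test."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

The `(y1 > y) != (y2 > y)` comparison also handles the classic corner case of the ray passing exactly through a vertex, by treating each edge as half-open.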

## algorithm - Determine if a given lat/long is within the borders of the...

It may also be worth considering that the complexity of many algorithms is based on more than one variable, particularly in multi-dimensional problems. For example, I recently had to write an algorithm for the following: given a set of n points and m polygons, extract all the points that lie in any of the polygons. The complexity is based on two known variables, n and m, and the unknown of how many points are in each polygon. The big O notation here is quite a bit more involved than O(f(n)) or even O(f(n) + g(m)). Big O is good when you are dealing with large numbers of homogeneous items, but don't expect this to always be the case.

It is also worth noting that the actual number of iterations over the data is often dependent on the data. Quicksort is usually quick, but give it presorted data and it slows down. My points-and-polygons algorithm ended up quite fast, close to O(n + m log(m)), based on prior knowledge of how the data was likely to be organised and the relative sizes of n and m. It would fall down badly on randomly organised data of different relative sizes.

A final thing to consider is that there is often a direct trade-off between the speed of an algorithm and the amount of space it uses. Pigeonhole sorting is a pretty good example of this. Going back to my points and polygons, let's say that all my polygons were simple and quick to draw, and I could draw them filled on screen, say in blue, in a fixed amount of time each. So if I draw my m polygons on a black screen it would take O(m) time. To check whether any of my n points is in a polygon, I simply check whether the pixel at that point is blue or black. So the check is O(n), and the total analysis is O(m + n). The downside, of course, is that I need near-infinite storage if I'm dealing with real-world coordinates to millimetre accuracy... ho hum.

## optimization - What is Big O notation? Do you use it? - Stack Overflow

## Explanation of the Inner Loop logic:

Your current brute-force algorithm is O(n^2). For just your two 70,000-line polygons that's a factor of almost 10 billion operations, to say nothing of the 700,000 other polygons. Obviously, no amount of mere code optimization is going to be enough, so you need some kind of algorithmic optimization that can lower that O(n^2) without being unduly complicated.

The O(n^2) comes from the nested loops in the brute-force algorithm that are each bounded by n, making it O(n*n). The simplest way to improve this would be to find some way to reduce the inner loop so that it is not bound by or dependent on n. So what we need to find is some way to order or re-order the inner list of lines to check the outer line against so that only a part of the full list needs to be scanned.

The approach that I am taking takes advantage of the fact that if two line segments intersect, then the ranges of their X values must overlap. Mind you, this doesn't mean that they do intersect, but if their X ranges don't overlap, then they cannot intersect, so there's no need to check them against each other. (This is true of the Y ranges also, but you can only leverage one dimension at a time.)

The advantage of this approach is that these X ranges can be used to order the endpoints of the lines which can in turn be used as the starting and stopping points for which lines to check against for intersection.

So specifically what we do is to define a custom class (endpointEntry) that represents the High or Low X values of the line's two points. These endpoints are all put into the same List structure and then sorted based on their X values.

Then we implement an outer loop where we scan the entire list just as in the brute-force algorithm. However our inner loop is considerably different. Instead of re-scanning the entire list for lines to check for intersection, we rather start scanning down the sorted endpoint list from the high X value endpoint of the outer loop's line and end it when we pass below that same line's low X value endpoint. In this way, we only check the lines whose range of X values overlap the outer loop's line.

```csharp
class CheckPolygon2
{
    // internal supporting classes
    class endpointEntry
    {
        public double XValue;
        public bool isHi;
        public Line2D line;
        public double hi;
        public double lo;
        public endpointEntry fLink;
        public endpointEntry bLink;
    }

    class endpointSorter : IComparer<endpointEntry>
    {
        public int Compare(endpointEntry c1, endpointEntry c2)
        {
            // sort values on XValue, descending
            if (c1.XValue > c2.XValue) { return -1; }
            else if (c1.XValue < c2.XValue) { return 1; }
            else // must be equal, make sure hi's sort before lo's
            if (c1.isHi && !c2.isHi) { return -1; }
            else if (!c1.isHi && c2.isHi) { return 1; }
            else { return 0; }
        }
    }

    public bool CheckForCrossing(List<Line2D> lines)
    {
        List<endpointEntry> pts = new List<endpointEntry>(2 * lines.Count);

        // Make endpoint objects from the lines so that we can sort all
        // of the lines' endpoints.
        foreach (Line2D lin in lines)
        {
            // make the endpoint objects for this line
            endpointEntry hi, lo;
            if (lin.P1.X < lin.P2.X)
            {
                hi = new endpointEntry() { XValue = lin.P2.X, isHi = true,  line = lin, hi = lin.P2.X, lo = lin.P1.X };
                lo = new endpointEntry() { XValue = lin.P1.X, isHi = false, line = lin, hi = lin.P1.X, lo = lin.P2.X };
            }
            else
            {
                hi = new endpointEntry() { XValue = lin.P1.X, isHi = true,  line = lin, hi = lin.P1.X, lo = lin.P2.X };
                lo = new endpointEntry() { XValue = lin.P2.X, isHi = false, line = lin, hi = lin.P2.X, lo = lin.P1.X };
            }
            // add them to the sort-list
            pts.Add(hi);
            pts.Add(lo);
        }

        // sort the list
        pts.Sort(new endpointSorter());

        // set the endpoint forward and backward links
        endpointEntry prev = null;
        foreach (endpointEntry pt in pts)
        {
            if (prev != null)
            {
                pt.bLink = prev;
                prev.fLink = pt;
            }
            prev = pt;
        }

        // NOW, we are ready to look for intersecting lines
        foreach (endpointEntry pt in pts)
        {
            // for every Hi endpoint ...
            if (pt.isHi)
            {
                // check every other line whose X-range is either wholly
                // contained within our own, or that overlaps the high part
                // of ours. The other two cases of overlap (overlaps our
                // low end, or wholly contains us) are covered by hi points
                // above that scan down to check us.

                // scan down for each lo-endpoint below us, checking each's
                // line for intersection until we pass our own lo-X value
                for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)
                {
                    // is this a lo-endpoint?
                    if (!pLo.isHi)
                    {
                        // check its line for intersection
                        if (pt.line.intersectsLine(pLo.line))
                            return true;
                    }
                }
            }
        }
        return false;
    }
}
```

I am not certain what the true execution complexity of this algorithm is, but I suspect that for most non-pathological polygons it will be close to O(n*SQRT(n)) which should be fast enough.

The inner loop simply scans the endpoint list in the same sorted order as the outer loop, but it starts scanning from where the outer loop currently is in the list (which is the hi-X point of some line) and only scans until the endpoint values drop below the corresponding lo-X value of that same line.

What's tricky here is that this cannot be done with an enumerator (the foreach(..in pts) of the outer loop), because there's no way to enumerate a sublist of a list, nor to start an enumeration at another enumeration's current position. So instead what I did was use the forward and backward link (fLink and bLink) properties to make a doubly-linked list structure that retains the sorted order of the list, but that I can incrementally scan without enumerating the list:

for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)

Breaking this down, the old-style for loop specifier has three parts: initialization, condition, and increment-decrement. The initialization expression, endpointEntry pLo = pt.fLink; initializes pLo with the forward Link of the current point in the list. That is, the next point in the list, in descending sorted order.

Then the body of the inner loop executes. Then the increment-decrement pLo = pLo.fLink is applied, which simply sets the inner loop's current point (pLo) to the next lower point using its forward link (pLo.fLink), thus advancing the loop.

Finally, it loops after testing the loop condition (pLo != null) && (pLo.XValue >= pt.lo), which continues so long as the new point isn't null (which would mean that we were at the end of the list) and so long as the new point's XValue is still greater than or equal to the low X value of the outer loop's current point. This second condition ensures that the inner loop only looks at lines that overlap the line of the outer loop's endpoint.

What is clearer to me now, is that I probably could have gotten around this whole fLink-bLink clumsiness by treating the endPoints List as an array instead:

- Fill up the list with endPoints

- Sort the List by XValue

- Outer Loop through the list in descending order, using an index iterator (i), instead of a foreach enumerator

- (A) Inner loop through the list, using a second iterator (j), starting at i and ending when it passes below pt.lo.

That I think would be much simpler. I can post a simplified version like that, if you want.
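As a hedged sketch (mine, not the author's C#), that array-based variant might look like this in Python, with intersectsLine replaced by a standard orientation test; for an actual polygon you would additionally skip adjacent edges, which always "intersect" at their shared vertex:

```python
def segments_intersect(p1, p2, p3, p4):
    """Orientation-based segment intersection test, counting touching
    and collinear-overlap cases as intersections."""
    def cross(a, b, c):
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    def on_segment(a, b, c):
        return (cross(a, b, c) == 0 and
                min(a[0], b[0]) <= c[0] <= max(a[0], b[0]) and
                min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))
    d1, d2 = cross(p3, p4, p1), cross(p3, p4, p2)
    d3, d4 = cross(p1, p2, p3), cross(p1, p2, p4)
    if ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0)):
        return True
    return (on_segment(p3, p4, p1) or on_segment(p3, p4, p2) or
            on_segment(p1, p2, p3) or on_segment(p1, p2, p4))

def has_crossing(lines):
    """lines: list of ((x1, y1), (x2, y2)) tuples. Index-based version
    of the endpoint sweep: sort every endpoint by X descending (hi
    before lo on ties); for each hi endpoint, scan forward only while
    endpoints stay at or above that line's lo-X value."""
    pts = []
    for line in lines:
        a, b = line
        hi_x, lo_x = (a[0], b[0]) if a[0] >= b[0] else (b[0], a[0])
        pts.append((hi_x, True, line, lo_x))
        pts.append((lo_x, False, line, lo_x))
    pts.sort(key=lambda e: (-e[0], not e[1]))
    for i, (x, is_hi, line, lo_x) in enumerate(pts):
        if not is_hi:
            continue
        for j in range(i + 1, len(pts)):
            xj, j_is_hi, other, _ = pts[j]
            if xj < lo_x:
                break  # past our own lo endpoint: X-ranges can't overlap
            if not j_is_hi and other is not line:
                if segments_intersect(line[0], line[1], other[0], other[1]):
                    return True
    return False
```

The `break` on `xj < lo_x` is the whole point: it is the index-based equivalent of the fLink scan terminating at `pLo.XValue >= pt.lo`.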

It takes a total of 15 minutes to run all 700,000 polygons! That's extremely good. I just need to check if it doesn't miss out on any.

@EvanParsons How did the testing go? Did this algorithm work out for you?

I'm 99% sure it's doing what I want. The only issue is that I don't understand it completely yet (although I have a fairly good idea what it does), I'm going to dedicate some time tomorrow and make sure I understand it. I made some adjustments to my lineSegment intersection method and a few other areas and got the time down to 8:47. This is basically as fast as my original implementation of the Hoey-Shamos algorithm. Thanks a bunch!

Okay, so I think I generally understand what's going on. But what's going on in the inner loop? I've never seen a for loop like that before.

## c# - Implementing a brute force algorithm for detecting a self-interse...

Originally, I came up with an O(N^4) algorithm, but noticed I was way over-complicating it and brought it down to an O(N^2) algorithm. There seems to be a bug in the code somewhere that causes issues in at least the case of a vertical line of points. It might be a special case (requiring changing a few lines of code), or a sign of a larger flaw in the algorithm. If it's the latter, then I'm inclined to say the problem is NP-hard and offer the algorithm as a heuristic. I've spent all the time I care to coding and debugging it. In any case, it seems to work fine in other cases. However, it's difficult to test for correctness when the correct algorithms seem to be O(2^N).

```python
def maximal_convex_hull(points):
    # Expects a list of 2D points in the format (x,y)
    points = sorted(set(points))  # Remove duplicates and sort
    if len(points) <= 1:  # End early
        return points

    def cross(o, a, b):  # Cross product
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    # Use a queue to extend Andrew's algorithm in the following ways:
    # 1. Start a new convex hull for each successive point
    # 2. Keep track of the largest convex hull along the way
    Q = []
    size = 0
    maximal_hull = None
    for i, p in enumerate(points):
        remaining = len(points) - i + 1
        new_Q = []
        for lower, upper in Q:  # Go through the queue
            # Build upper and lower hull simultaneously, slightly less efficient
            while len(lower) >= 2 and cross(lower[-2], lower[-1], p) < 0:
                lower.pop()
            lower.append(p)
            while len(upper) >= 2 and cross(upper[-2], upper[-1], p) > 0:
                upper.pop()
            upper.append(p)
            new_size = len(lower) + len(upper)
            if new_size > size:  # Check for a new largest convex hull
                size = new_size
                maximal_hull = (lower, upper)
            # There is an odd bug? in that when one or both if statements are
            # removed, xqg237.tsp (TSPLIB) gives slightly different solutions
            # and is missing a number of points it should have.
            # Seems like the opposite should be true if anything, since they
            # were intended to be easy optimizations, not required code.
            if remaining + new_size >= size:  # Don't bother if not enough
                new_Q.append((lower, upper))  # Add an updated convex hull
        if remaining > size:  # Don't bother if not enough
            new_Q.append(([p], [p]))  # Add a new convex hull
        Q = new_Q  # Update to the new queue

    # Extract and merge the lower and upper to a maximum convex hull
    # Only one of the last points is omitted because it is repeated
    # Asserts and related variables used for testing
    # Could just have "return lower[:-1] + list(reversed(upper[:-1]))[:-1]"
    lower, upper = maximal_hull
    max_hull_set = set(lower) | set(upper)
    upper = list(reversed(upper[:-1]))[:-1]
    max_convex_hull = lower + upper
    assert len(lower) + len(upper) == len(max_hull_set)
    assert max_hull_set == set(max_convex_hull)
    return max_convex_hull
```

Edit: This algorithm isn't correct, but it served as inspiration for the correct algorithm and thus I'm leaving it here.

Thanks for the idea! By extending Andrew's algorithm, I got an O(N^4) algorithm using dynamic programming that I think doesn't have any flaw. I still have some doubts about your O(N^2), but I will check it again over the weekend :)

Andrew's algorithm is O(N) (but requires a sort, making it O(N log N)); this algorithm runs O(N) instances of Andrew's algorithm, making for an O(N^2) runtime. Although the algorithm may not go far enough.

Yes, but I have some doubt about the correctness of that algorithm. Can you explain a little how it works, since I'm not really familiar with Python?

Like in this picture imgur.com/DsJhF71: wasn't EFG popped when C was inserted, although the optimal upper hull is AEFGD?

What's the point set for that example? You might have a point that it should be considering the upper and lower hulls separately, although that would still result in an O(N^2) algorithm, just with larger hidden coefficients.

## algorithm - Finding largest subset of points forming a convex polygon ...

This is my dynamic programming O(N^4) algorithm based on the idea from Andrew's algorithm posted by Nuclearman; I'm still looking for an O(N^3) algorithm.

```
upper_hull(most left point, previous point, current point) {
    ret = 0
    if (current point != most left point)
        /* at any time we can decide to end the upper hull and build the
           lower hull from the current point, ending at the most left point */
        ret = max(ret, lower_hull(most left point, -1, current point))
    for all points on the right of current point
        /* here we try all possible next points that still satisfy
           the condition of a convex polygon */
        if (cross(previous point, current point, next point) >= 0)
            ret = max(ret, 1 + upper_hull(most left point, current point, next point))
    return ret
}

lower_hull(most left point, previous point, current point) {
    if (current point == most left point)
        return 0
    ret = -INF
    /* it might be impossible to build a convex hull here;
       if so, this returns -infinity */
    for all points on the left of current point and not on the left of most left point
        if (cross(previous point, current point, next point) >= 0)
            ret = max(ret, 1 + lower_hull(most left point, current point, next point))
    return ret
}
```

First sort the points by x-axis, breaking ties by y-axis, then try every point as the most-left point and run upper_hull(p, -1, p). Please tell me if there's any flaw in this algorithm.

One possibility is to precalculate all of the cross-product results (O(N^3)) and break them into two graphs based on whether the result is positive or negative, then use depth-first search starting at the left-most point to find the longest of the shortest paths. It seems like it should be doable in O(E), which, since an edge is a triple (a,b),(b,c), is O(N^3). However, you then have to do this for O(N) points (for each left-most point), resulting in an overall time complexity of O(N^4). Therefore, I'm fairly sure now that you can't get better than O(N^4).

Thanks for the bounty, glad I could be of help.

## algorithm - Finding largest subset of points forming a convex polygon ...

The general problem is similar to the minimum bounding problem, which finds the "oriented minimum bounding box enclosing a set of points". If the points form a convex polygon, an O(n) "algorithm for the minimum-area enclosing rectangle is known". This could be adapted to find the minimum square to cover a set of points.

Your problem would be more difficult since you're dealing with x squares, but it's simplified by the requirement that the squares be parallel to the axes. You need to work out an algorithm to divide the points into x areas so that each area can be covered by the same size square. Let's say x is 4. In that case, you would look for a line that divides the points into two equal areas by x and a line that divides them into two equal areas by y, and adjust from there.
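For the x = 4 case, that initial split might be sketched like this, using the x- and y-medians as the dividing lines purely as a starting point to adjust from:

```python
def quadrant_split(points):
    """Split points into four groups by the x- and y-medians; each
    group would then get its own covering square as a first guess."""
    xs = sorted(x for x, _ in points)
    ys = sorted(y for _, y in points)
    mx = xs[len(xs) // 2]
    my = ys[len(ys) // 2]
    groups = {"SW": [], "SE": [], "NW": [], "NE": []}
    for x, y in points:
        key = ("N" if y >= my else "S") + ("E" if x >= mx else "W")
        groups[key].append((x, y))
    return mx, my, groups
```

The adjustment step would then shift the two dividing lines to balance the side lengths of the four squares rather than the point counts.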

## algorithm - Minimum length of squares to cover k points - Stack Overfl...

Here is a solution which uses O(n log n) time for preprocessing and O((log n)^2 + cnt) per query, where cnt is the number of intersections. It works for any polygon.

1) Preprocessing: store each segment as a pair (low_y, high_y). Sort them by low_y. Now it is possible to build a two-dimensional segment tree where the first dimension is low_y and the second dimension is high_y. It can take O(n log n) space and time if done properly (one can keep a sorted vector of high_y values for each segment tree node, containing those and only those high_y values which correspond to that particular node).

2) Query: it can be rephrased in the following way: find all segments (that is, pairs) which satisfy the condition low_y <= query_y <= high_y. To find all such segments, one can traverse the segment tree and decompose the range [min(low_y), query_y] into a union of at most O(log n) nodes (here only the first dimension is considered). For a fixed node, one can apply binary search over the sorted high_y vector to extract only those segments which satisfy the condition (the first inequality holds because of the way the tree is traversed, so we only need to check high_y). Here we have O(log n) nodes (due to the properties of a segment tree), and a binary search takes O(log n) time, so this step has O((log n)^2) time complexity. After the smallest high_y is found with binary search, it is clear that the tail of the vector (from this position to the end) contains those and only those segments which do intersect the query line. So one can simply iterate over them and find the intersection points. This step takes O(cnt) time, because a segment is checked if and only if it intersects the line (cnt is the total number of intersections between the line and the polygon). Thus, the entire query has O((log n)^2 + cnt) time complexity.

3) There are actually at least two corner cases here: i) a point of intersection is a common point of two adjacent polygon sections, and ii) a horizontal section. They should be handled carefully depending on the desired output (for example, one can ignore horizontal edges completely, or treat an entire edge as an intersection).

I'm confused... All you need is a plain ol' segment tree: (1) construct it in O(n log n) time; (2) query it in O(log n + k) time per query, where k is the number of intervals that contain the query y co-ord.

Each segment is two-dimensional in nature. It is not possible to unify low_y and high_y; or at least I do not see a way to unify them correctly.

We don't care about x co-ords at all though -- for each edge in the polygon, we can just store its y extent (pair of y co-ords) as a segment in the segment tree, with the x co-ords stored in 2 extra fields (i.e. fields that the segment tree doesn't care about).

For a one-dimensional segment tree, it is necessary to maintain a list or vector of edges for each node (because there can be more than one edge covering a concrete y). That is O(n^2) space in the worst case (if all edges are long and cover almost the entire range).

No, every segment (here, every vertical extent of a polygon edge) will appear at exactly 1 node in the segment tree. (I had an explanation here of where that is, but it was wrong -- I trust the Wikipedia article though :) ) Here, its x co-ords can be stored in the same place (and treated as a black box by the segment tree functions).