Machine Problem: More on searching

Overview

The goal for this assignment is to get more comfortable with the search functions explored in class, and to adapt them to the problem of searching a graph.

You will be writing a program that enables the user to search for a route between any two airports, given a maximum "hop distance" --- i.e., the maximum distance between any two airports on the route. A route should not contain a given airport more than once, and in addition may not always exist for a some hop distance. For airport information, we will be using the open source database maintained at OpenFlights.org.

On Graphs

Unlike trees, in which every two nodes are connected by precisely one path (formed by parent-child connections), graphs are data structures where nodes can be connected in entirely arbitrary ways. In a graph, nodes are called vertices, and connections between nodes are called edges.

This is often undesirable, however, as re-visiting a vertex during graph traversal might make for a sub-optimal route.

Consider the following graph, in which the vertices represent cities in the United States. The edges between the cities represent bi-directional connections, and the labels on the edges correspond to inter-city distances. Only cities with edges between them are considered neighbors, for the purposes of traversal.

It should be clear that there are potentially multiple routes between any two vertices in the graph. From Miami to Seattle, for instance, we may choose the [Miami, Chicago, Los Angeles, Seattle] route or the [Miami, New York, Chicago, Seattle] route, among others. Routes containing cycles are also possible; e.g., [Miami, Chicago, St. Louis, New York, Chicago, Seattle]. Your solution will need to find a route without a cycle.

Each route has a corresponding distance, which is simply the sum of the edges traversed. The [Miami, Chicago, Los Angeles, Seattle] route has a total distance of 1378 + 2015 + 1137 = 4530 miles, for instance.

Building a graph from airport data

Our airport dataset is a file that contains information for over 7100 airports located across the globe, where each row contains comma-separated fields describing a distinct airport. The most important fields to us are the airport name and the latitude and longitude, the latter of which we can use to compute distances between different airports.

Recall that our search function will be driven by the "maximum hop distance" between any two airports on the route --- the intuition being perhaps that we are seeking to travel between two cities using a plane with limited range. The cities reachable from any given city will therefore be obtained by filtering out all cities from the dataset which are more than the maximum hop distance away. This would form the basis of a `neighbors` function for our graph.

Data types and functions to compute the distance between two latitude/longitude pairs and represent routes are provided; here are some of them:

``````data Location = Location {
locName :: String,
latitude :: Double,
longitude :: Double
}

data Path = Path {
totalDist :: Double,
route :: [Location]
} deriving (Eq)

geoDistance :: (Double,Double) -> (Double,Double) -> Double
geoDistance (lat1,lon1) (lat2,lon2)
= 3959 * (acos (sindeg(lat1) * sindeg(lat2)
+ cosdeg(lat1) * cosdeg(lat2) * cosdeg(lon2-lon1)))
where sindeg = sin . toRadian
cosdeg = cos . toRadian
toRadian deg = pi/180.0 * deg

distanceBetween :: Location -> Location -> Double
distanceBetween (Location _ lat1 lon1)
(Location _ lat2 lon2) = geoDistance (lat1, lon1)
(lat2, lon2)
``````

Implementation details

You are to implement the function `findPath`, with the following type:

``````findPath :: [Location] -> String -> String -> Double -> Maybe Path
``````

The first argument is the list of locations (airports), the second and third are the names of the origin and destination aiports, and the fourth is the maximum hop distance. The function should return a `Just Path` if a route is found and `Nothing` otherwise.

We recommend, in addition to `findPath`, that you implement a more general purpose `graphSearch` function that follows the same general design as that of the `treeSearch` function presented in class. Your graph search function will, however, need to ensure that new nodes discovered have not already been visited, which will require a list of previously visited nodes to be passed along to each recursive call.

The locations list will be loaded using a comma-separated-value (CSV) library and passed to the `findPath` function. A function that does this, and prompts the user interactively for the three remaining arguments to be passed along is provided:

``````airportCSVFile = "data/airports.csv"

searchAirports :: IO ()
searchAirports = do
csv <- parseCSVFromFile airportCSVFile
case csv of
Left _ -> putStrLn "error loading airport data"
Right airports -> do
let cities = map (\rec -> Location (rec !! 1)
(read \$ rec !! 6)
(read \$ rec !! 7))
\$ filter ((==14) . length) airports
putStrLn "Enter origin airport:"
origin <- getLine
putStrLn "Enter destination airport:"
dest <- getLine
putStrLn "Enter hop distance:"
putStrLn \$ show \$ findPath cities origin dest hop
``````

See the provided source file to help you get started (using it is optional, though recommended). You should add an "airports" executable to your cabal file, as shown here, and you will also need to add the data file to your repository --- ideally in the "data" subfolder (which you'll need to create) --- so that the CSV library can locate it.

Testing

Included in the provided source file is a short list of locations --- available via the `cities` variable --- suitable for testing:

``````lat_long_data = [ ("Bangkok", 13.76, 100.5),
("Beijing", 39.90, 116.41),
("Buenos Aires", -34.6, -58.38),
("Chicago", 41.87, -87.63),
("Dallas", 32.77, -96.79),
("Denver", 39.73, -104.99),
("Dubai", 25.20, 55.27),
("Honolulu", 21.30, -157.85),
("London", 51.50, -0.12),
("Los-Angeles", 34.05, -118.24),
("New-York", 40.71, -74.00),
("Paris", 48.85, 2.35) ]

testAirports :: [Location]
testAirports = map (\(n,lat,lon)->Location n lat lon) lat_long_data
``````

Here are some sample routes (others are possible) discovered by a reference implementation of `findPath`:

``````> findPath testAirports "Los-Angeles" "New-York" 1000
Just <2459.3319233721513: Los-Angeles - Denver - Chicago - New-York >

> findPath testAirports "Los-Angeles" "New-York" 500
Nothing

findPath testAirports "Chicago" "Bangkok" 4000
Just <10385.014182737235: Chicago - London - Dubai - Bangkok >

findPath testAirports "Chicago" "Bangkok" 8000
Just <8637.711070622021: Chicago - Beijing - Bangkok >

findPath testAirports "Chicago" "Bangkok" 10000
Just <8558.90431700429: Chicago - Bangkok >

findPath testAirports "Chicago" "Bangkok" 1000
Nothing
``````

A `validatePath` function is also provided to help you ensure that all locations on your path are at most some distance apart, and that the total distance is accurate. Note that the function does not check that all locations in the path are unique.

``````> validatePath 1000 \$ fromJust \$ findPath cities "Los-Angeles" "New-York" 1000
True
``````

When you're ready to test your implementation with the full dataset, simply use the provided `searchAirports` function (you can also do `stack exec airports` if you correctly set up the cabal file). Here are some results your function should be able to come up with -- we'll allow for up to a 15-second search time, which should be plenty, even for a simple search algorithm:

• "Chicago O'Hare International Airport" to "John F Kennedy International Airport" with a hop distance of 500:

Just <775.2957853311073: Chicago O'Hare International Airport - John C. Munro Hamilton International Airport - John F Kennedy International Airport >

• "Chicago O'Hare International Airport" to "John F Kennedy International Airport" with a hop distance of 20:

Nothing

• "London Heathrow Airport" to "Charles de Gaulle International Airport" with a hop distance of 150:

Just <280.7740668318438: London Heathrow Airport - Koksijde Air Base - Charles de Gaulle International Airport >

• "Honolulu International Airport" to Narita International Airport" with a hop distance of 2500:

Just <3960.1095733170896: Honolulu International Airport - Henderson Field - Sendai Airport - Narita International Airport >

• "Honolulu International Airport" to Narita International Airport" with a hop distance of 500:

Nothing

If your program can find the routes above (or correct alternatives), you'll receive full credit.

For more of a challenge, you can attempt to add heuristics to your search implementation so that it can tackle cities that are farther apart, or shorter hop distances (which makes the number of potential routes explode in number). Here are some more challenging searches (again, with a 15-second limit):

• "London Heathrow Airport" to "Charles de Gaulle International Airport" with a hop distance of 50:

Just <226.00848752817393: London Heathrow Airport - London Biggin Hill Airport - Lydd Airport - Le Touquet-Côte d'Opale Airport - Abbeville - Paris Beauvais Tillé Airport - Charles de Gaulle International Airport >

• "Honolulu International Airport" to Narita International Airport" with a hop distance of 1500:

Just <4506.014981988908: Honolulu International Airport - Henderson Field - Wake Island Airfield - Minami Torishima Airport - Narita International Airport >

• "Chicago O'Hare International Airport" to "John F Kennedy International Airport" with a hop distance of 50:

Just <881.2512377878584: Chicago O'Hare International Airport - Chicago Meigs Airport - Michigan City Municipal Airport - Nappanee Municipal Airport - Smith Field - Van Wert County Airport - Lima Allen County Airport - Fostoria Metropolitan Airport - Griffing Sandusky Airport - Cleveland Hopkins International Airport - Cuyahoga County Airport - Youngstown Warren Regional Airport - Beaver County Airport - Allegheny County Airport - Arnold Palmer Regional Airport - Bedford County Airport - Hagerstown Regional Richard A Henson Field - Montgomery County Airpark - Martin State Airport - New Castle Airport - Northeast Philadelphia Airport - Trenton Mercer Airport - Linden Airport - John F Kennedy International Airport >

Submission

To submit your work, be sure your "`MP4.hs`" file is added to your repository, then commit all your changes and push your work to our shared BitBucket repository.