Sorting the data base on their angle relative to the center as in @JoeKington 's solution might have problems with some parts of the data:
In [1]:
import scipy.spatial as ss
import matplotlib.pyplot as plt
import numpy as np
import re
%matplotlib inline
In [2]:
data=np.array([[ 2.08937872e+001, 1.99020033e+001, 2.28260611e+001,
6.27711094e+000, 3.30392288e+000, 1.30312878e+001,
8.80768833e+000, 1.31238275e+001, 1.57400130e+001,
5.00278061e+000, 1.70752624e+001, 1.79131456e+001,
1.50746185e+001, 2.50095731e+001, 2.15895974e+001,
1.23237801e+001, 1.14860312e+001, 1.44268222e+001,
6.37680265e+000, 7.81485403e+000],
[ -1.19702178e-001, -1.14050879e-001, -1.29711421e-001,
8.32977493e-001, 7.27437322e-001, 8.94389885e-001,
8.65931116e-001, -6.08199292e-002, -8.51922900e-002,
1.12333841e-001, -9.88131292e-324, 4.94065646e-324,
-9.88131292e-324, 4.94065646e-324, 4.94065646e-324,
0.00000000e+000, 0.00000000e+000, 0.00000000e+000,
-4.94065646e-324, 0.00000000e+000]])
In [3]:
plt.plot(data[0], data[1])
plt.title('Unsorted Data')
Out[3]:
<matplotlib.text.Text at 0x10a5c0550>

See x values between 15 and 20 are not sorted correctly.
In [10]:
#Calculate the angle in degrees of [0, 360]
sort_index = np.angle(np.dot((data.T-data.mean(1)), np.array([1.0, 1.0j])))
sort_index = np.where(sort_index>0, sort_index, sort_index+360)
#sorted the data by angle and plot them
sort_index = sort_index.argsort()
plt.plot(data[0][sort_index], data[1][sort_index])
plt.title('Data Sorted by angle relatively to the centroid')
plt.plot(data[0], data[1], 'r+')
Out[10]:
[<matplotlib.lines.Line2D at 0x10b009e10>]

We can sort the data based on a nearest neighbor approach, but since the x and y are of very different scale, the choice of distance metrics becomes an important issue. We will just try all the distance metrics available in scipy to get an idea:
In [7]:
def sort_dots(metrics, ax, start):
dist_m = ss.distance.squareform(ss.distance.pdist(data.T, metrics))
total_points = data.shape[1]
points_index = set(range(total_points))
sorted_index = []
target = start
ax.plot(data[0, target], data[1, target], 'o', markersize=16)
points_index.discard(target)
while len(points_index)>0:
candidate = list(points_index)
nneigbour = candidate[dist_m[target, candidate].argmin()]
points_index.discard(nneigbour)
points_index.discard(target)
#print points_index, target, nneigbour
sorted_index.append(target)
target = nneigbour
sorted_index.append(target)
ax.plot(data[0][sorted_index], data[1][sorted_index])
ax.set_title(metrics)
In [6]:
dmetrics = re.findall('pdist\(X\,\s+\'(.*)\'', ss.distance.pdist.__doc__)
In [8]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
try:
sort_dots(metrics, ax, 5)
except:
ax.set_title(metrics + '(unsuitable)')
It looks like standardized euclidean and mahanalobis metrics give the best result. Note that we choose a starting point of the 6th data (index 5), it is the data point this the largest y value (use argmax to get the index, of course).

In [9]:
f, axes = plt.subplots(4, 6, figsize=(16,10), sharex=True, sharey=True)
axes = axes.ravel()
for metrics, ax in zip(dmetrics, axes):
try:
sort_dots(metrics, ax, 13)
except:
ax.set_title(metrics + '(unsuitable)')
This is what happens if you choose the starting point of max. x value (index 13). It appears that mahanalobis metrics is better than standardized euclidean as it is not affected by the starting point we choose.

ais not what made the plot that you're showingyvalues, but post some sample data will help.