I'm trying to find an intelligent solution to how text / annotations are placed into a matplotlib plt so they don't over lap with the data point being annotated. Code snip below. Apologies for long dict at the top. So far I've found adjustText which looks very promising, but I can't seem to get it working in this instance. The code below uses adjust_text(), but at the moment all text is being placed together in one part of the ax and I don't understand why. If you run without adjust_text() it places text roughly where it should be, but text is overlapping the data point in places, which I want to avoid. Grateful for any help.
fig, ax = plt.subplots(figsize=(10, 8))
dl_data = {
"Center": {
"axis": (0, 0),
"tp": (0, 0),
"r": 21.37311395187889,
"colour": "#ffffff",
"text": "Center",
"fill": "solid",
"ec": "#808080",
"alignment": ("center", "center"),
},
"First": {
"r": 6.758772077825972,
"wlc": 45.681000000000004,
"text": "First",
"colour": "#FFFFFF",
"fill": "dashed",
"ec": "#808080",
"alignment": ("center", "center"),
"axis": (-68.82111180215705, -1.2642233142341064e-14),
"tp": (-68.82111180215705, -1.2642233142341064e-14),
},
"Second": {
"r": 18.979199140111263,
"wlc": 360.21000000000004,
"text": "Second",
"colour": "#FFFFFF",
"fill": "dashed",
"ec": "#808080",
"alignment": ("center", "center"),
"axis": (-34.41055590107855, 59.600831137357034),
"tp": (-34.41055590107855, 59.600831137357034),
},
"P1": {
"r": 4.779173568725037,
"wlc": 2.6,
"colour": "#92a700",
"text": "P1, £3",
"fill": "solid",
"ec": "#92a700",
"axis": (-80.83697480558055, -1.4849511367261418e-14),
"alignment": ("right", "top"),
"tp": (-87.6161483743056, -1.6094825349031936e-14),
},
"P2": {
"r": 4.779173568725037,
"wlc": 0,
"colour": "#ffba00",
"text": "P2 has a long\nName, £0\n\n",
"fill": "solid",
"ec": "#ffba00",
"axis": (-13.031791598544089, 30.17548646933409),
"alignment": ("left", "top"),
"tp": (-9.047093352116576, 24.691019844418072),
},
"P3": {
"r": 4.779173568725037,
"wlc": 0.21,
"colour": "#92a700",
"text": "P3 has a very,\nlong long long,\nname, £0 \n",
"fill": "solid",
"ec": "#92a700",
"axis": (-55.78932020361301, 30.175486469334082),
"alignment": ("right", "top"),
"tp": (-59.77401845004052, 24.691019844418065),
},
"P4": {
"r": 15.811388300841896,
"wlc": 250,
"colour": "#e77200",
"text": "P4 also\nhas a longish\nname, £250\n",
"fill": "solid",
"ec": "#e77200",
"axis": (-34.41055590107855, 95.97255740839438),
"alignment": ("center", "center"),
"tp": (-34.41055590107855, 113.78394570923628),
},
"P5": {
"r": 4.779173568725037,
"wlc": 6.6,
"colour": "#92a700",
"text": "P5 is medium,\n£7\n\n",
"fill": "solid",
"ec": "#92a700",
"axis": (-69.00212318005225, 70.8403126698613),
"alignment": ("right", "top"),
"tp": (-75.44950037768407, 72.9351925104148),
},
"P6": {
"r": 10.16857905510893,
"wlc": 103.4,
"colour": "#92a700",
"text": "P6 is a very long name\nlike P4 is also,\n£100\n",
"fill": "solid",
"ec": "#92a700",
"axis": (0.181011377895139, 70.8403126698613),
"alignment": ("left", "top"),
"tp": (11.754017782309209, 74.600610395285),
},
}
ts = []
x_list = []
y_list = []
for c in dl_data.keys():
circle = plt.Circle(
dl_data[c]["axis"], # x, y position
radius=dl_data[c]["r"],
fc=dl_data[c]["colour"], # face colour
ec=dl_data[c]["ec"], # edge colour
zorder=2,
)
ax.add_patch(circle)
x = dl_data[c]["axis"][0]
y = dl_data[c]["axis"][1]
text = dl_data[c]["text"]
if c in ["Center", "First", "Second"]:
pass
else:
ts.append(ax.text(x, y, dl_data[c]["text"]))
x_list.append(x)
y_list.append(y)
adjust_text(
ts,
x=x_list,
y=y_list,
force_points=0.1,
arrowprops=dict(arrowstyle="->", color="red"),
)
plt.axis("scaled")
plt.axis("off")
plt.show()
There are two issues:
adjust_text must called after all drawing is completed, i.e. plt.axis("scaled") must come before adjust_text, see docs:
Call adjust_text the very last, after all plotting (especially
anything that can change the axes limits) has been done.
You must pass your circles as additional objects to be avoided: add_objects=objects
ts = []
x_list = []
y_list = []
objects = []
for c in dl_data.keys():
circle = plt.Circle(
dl_data[c]["axis"], # x, y position
radius=dl_data[c]["r"],
fc=dl_data[c]["colour"], # face colour
ec=dl_data[c]["ec"], # edge colour
zorder=2,
)
objects.append(circle)
ax.add_patch(circle)
x = dl_data[c]["axis"][0]
y = dl_data[c]["axis"][1]
text = dl_data[c]["text"]
if c in ["Center", "First", "Second"]:
pass
else:
ts.append(ax.text(x, y, dl_data[c]["text"].strip()))
x_list.append(x)
y_list.append(y)
plt.axis("scaled")
plt.axis("off")
adjust_text(
ts,
add_objects=objects,
arrowprops=dict(arrowstyle="->", color="red"),
)
I couldn't manage to move the P6 text away from the green and orange circles, though.
I want to import some weather data (temperature, wind speed, ...) that is all formatted in a JSON file into a PostgreSQL 11 table so I can then make queries on that data.
I've been able to manually insert some data into a table but that's only OK because it's a small amount of data and I'm planning on using a LOT more data afterwards. Here is what I've found using the INSERT function: https://datavirtuality.com/blog-json-in-postgresql/.
That's why I've been trying to use the COPY function but no luck so far, even after having read a lot of stuff on different sources on the Internet ...
The JSON file is downloadable there : https://queueresults.meteoblue.com/F2637B90-45BB-4E7A-B47C-C34CD56674B3 (let me know if the file doesn't exist anymore).
I've been able to import the JSON file as text into a table with:
create table temp_json (values text);
copy temp_json from '/home/cae/test.json';
But I don't think that's the best approach to be able to make efficient queries later on ...
I usually run into the following error during my tests:
ERROR: invalid input syntax for type json
DETAIL: The input string ended unexpectedly.
CONTEXT: JSON data, line 1: [
as if I'm not able to parse the JSON file and the array properly within PostgreSQL ...
Thanks for your help !
Edit: Here is the content of the JSON file:
[
{
"geometry": {
"type": "MultiPoint",
"locationNames": [
"59.4°N/24.7°E31.7m",
"59.4°N/24.8°E36.4m"
],
"coordinates": [
[
24.7,
59.4,
31.73
],
[
24.8,
59.4,
36.445
]
]
},
"domain": "NEMS12",
"codes": [
{
"unit": "°C",
"dataPerTimeInterval": [
{
"data": [
[
-0.395,
-0.195,
-0.099999994,
-0.030000001,
-0.060000002,
-0.099999994,
-0.099999994,
0.005,
-0.055,
0.19,
0.48,
0.725,
1.88,
1.88,
1.855,
1.935,
2.1950002,
2.595,
3.3049998,
4.115,
3.37,
2.97,
3.32,
3.5149999,
3.56,
3.44,
3.355,
3.3600001,
3.32,
3.32,
3.4250002,
3.42,
3.3899999,
3.445,
3.3200002,
3.0549998,
4.58,
4.01,
3.02,
2.79,
2.75,
2.76,
2.855,
2.99,
2.96,
2.775,
2.595,
2.4250002
],
[
-0.49,
-0.26,
-0.16,
-0.09,
-0.1,
-0.13,
-0.12,
0.01,
-0.07,
0.17,
0.44,
0.66,
1.84,
1.85,
1.83,
1.9,
2.15,
2.55,
3.27,
4.11,
3.46,
2.96,
3.31,
3.5,
3.55,
3.42,
3.33,
3.34,
3.29,
3.29,
3.43,
3.44,
3.42,
3.52,
3.41,
3.11,
4.53,
4,
3.01,
2.79,
2.76,
2.77,
2.87,
3,
2.93,
2.71,
2.53,
2.38
]
],
"gapFillRatio": 0
}
],
"level": "2 m above gnd",
"aggregation": "none",
"code": 11,
"variable": "Temperature"
}
],
"timeIntervals": [
[
"20180101T0000",
"20180101T0100",
"20180101T0200",
"20180101T0300",
"20180101T0400",
"20180101T0500",
"20180101T0600",
"20180101T0700",
"20180101T0800",
"20180101T0900",
"20180101T1000",
"20180101T1100",
"20180101T1200",
"20180101T1300",
"20180101T1400",
"20180101T1500",
"20180101T1600",
"20180101T1700",
"20180101T1800",
"20180101T1900",
"20180101T2000",
"20180101T2100",
"20180101T2200",
"20180101T2300",
"20180102T0000",
"20180102T0100",
"20180102T0200",
"20180102T0300",
"20180102T0400",
"20180102T0500",
"20180102T0600",
"20180102T0700",
"20180102T0800",
"20180102T0900",
"20180102T1000",
"20180102T1100",
"20180102T1200",
"20180102T1300",
"20180102T1400",
"20180102T1500",
"20180102T1600",
"20180102T1700",
"20180102T1800",
"20180102T1900",
"20180102T2000",
"20180102T2100",
"20180102T2200",
"20180102T2300"
]
],
"timeResolution": "hourly"
},
{
"geometry": {
"coordinates": [
[
24.7,
59.4,
31.73
],
[
24.8,
59.4,
36.445
]
],
"locationNames": [
"59.4°N/24.7°E31.7m",
"59.4°N/24.8°E36.4m"
],
"type": "MultiPoint"
},
"domain": "NEMS12",
"codes": [
{
"unit": "°C",
"aggregation": "none",
"code": 11,
"level": "1000 mb",
"dataPerTimeInterval": [
{
"data": [
[
-0.585,
-0.265,
-0.055,
0.04,
0.044999998,
0.08,
0.11,
0.205,
0.13499999,
0.43,
0.84000003,
1.2,
2.1,
2.33,
2.5,
2.72,
3.1750002,
3.775,
4.915,
5.37,
4.16,
3.795,
4.1949997,
4.41,
4.415,
4.275,
4.1800003,
4.16,
4.0950003,
4.08,
4.185,
4.1,
3.98,
3.575,
3.22,
2.92,
4.395,
3.7649999,
2.895,
2.66,
2.6550002,
2.72,
2.845,
2.955,
2.89,
2.685,
2.54,
2.355
],
[
-0.64,
-0.29,
-0.08,
0.01,
0.03,
0.08,
0.12,
0.24,
0.14,
0.4,
0.8,
1.13,
2.11,
2.34,
2.52,
2.74,
3.19,
3.82,
4.91,
5.45,
4.29,
3.81,
4.19,
4.42,
4.43,
4.28,
4.17,
4.15,
4.08,
4.06,
4.18,
4.12,
4.01,
3.66,
3.31,
2.97,
4.38,
3.79,
2.9,
2.68,
2.68,
2.75,
2.89,
2.99,
2.88,
2.64,
2.43,
2.27
]
],
"gapFillRatio": 0
}
],
"variable": "Temperature"
}
],
"timeIntervals": [
[
"20180101T0000",
"20180101T0100",
"20180101T0200",
"20180101T0300",
"20180101T0400",
"20180101T0500",
"20180101T0600",
"20180101T0700",
"20180101T0800",
"20180101T0900",
"20180101T1000",
"20180101T1100",
"20180101T1200",
"20180101T1300",
"20180101T1400",
"20180101T1500",
"20180101T1600",
"20180101T1700",
"20180101T1800",
"20180101T1900",
"20180101T2000",
"20180101T2100",
"20180101T2200",
"20180101T2300",
"20180102T0000",
"20180102T0100",
"20180102T0200",
"20180102T0300",
"20180102T0400",
"20180102T0500",
"20180102T0600",
"20180102T0700",
"20180102T0800",
"20180102T0900",
"20180102T1000",
"20180102T1100",
"20180102T1200",
"20180102T1300",
"20180102T1400",
"20180102T1500",
"20180102T1600",
"20180102T1700",
"20180102T1800",
"20180102T1900",
"20180102T2000",
"20180102T2100",
"20180102T2200",
"20180102T2300"
]
],
"timeResolution": "hourly"
},
{
"geometry": {
"type": "MultiPoint",
"locationNames": [
"59.4°N/24.7°E31.7m",
"59.4°N/24.8°E36.4m"
],
"coordinates": [
[
24.7,
59.4,
31.73
],
[
24.8,
59.4,
36.445
]
]
},
"domain": "NEMS12",
"codes": [
{
"unit": "°C",
"dataPerTimeInterval": [
{
"data": [
[
-7.0950003,
-6.615,
-4.815,
-3.55,
-2.6750002,
-2.1950002,
-2.695,
-2.87,
-2.1399999,
-0.995,
0.1,
1,
0.335,
0.38,
-0.030000001,
-0.8,
-0.18,
0.575,
1.11,
-0.32999998,
-1.03,
-2.31,
-3.09,
-3.7350001,
-3.93,
-3.905,
-3.92,
-3.71,
-3.625,
-3.195,
-3.7,
-3.32,
-3.72,
-3.915,
-3.93,
-3.605,
-4.315,
-3.8899999,
-3.815,
-3.38,
-3.2150002,
-3.27,
-3.435,
-3.47,
-3.43,
-3.37,
-3.44,
-3.51
],
[
-7.11,
-6.73,
-4.94,
-3.57,
-2.7,
-2.15,
-2.62,
-2.91,
-2.22,
-1.1,
0.03,
0.9,
0.36,
0.37,
0.11,
-0.74,
-0.13,
0.59,
1.19,
-0.19,
-0.95,
-2.18,
-3.08,
-3.68,
-3.97,
-3.94,
-3.93,
-3.69,
-3.63,
-3.27,
-3.7,
-3.32,
-3.68,
-3.9,
-3.97,
-3.6,
-4.29,
-3.92,
-3.8,
-3.37,
-3.24,
-3.28,
-3.42,
-3.44,
-3.39,
-3.35,
-3.37,
-3.44
]
],
"gapFillRatio": 0
}
],
"level": "850 mb",
"code": 11,
"aggregation": "none",
"variable": "Temperature"
}
],
"timeResolution": "hourly",
"timeIntervals": [
[
"20180101T0000",
"20180101T0100",
"20180101T0200",
"20180101T0300",
"20180101T0400",
"20180101T0500",
"20180101T0600",
"20180101T0700",
"20180101T0800",
"20180101T0900",
"20180101T1000",
"20180101T1100",
"20180101T1200",
"20180101T1300",
"20180101T1400",
"20180101T1500",
"20180101T1600",
"20180101T1700",
"20180101T1800",
"20180101T1900",
"20180101T2000",
"20180101T2100",
"20180101T2200",
"20180101T2300",
"20180102T0000",
"20180102T0100",
"20180102T0200",
"20180102T0300",
"20180102T0400",
"20180102T0500",
"20180102T0600",
"20180102T0700",
"20180102T0800",
"20180102T0900",
"20180102T1000",
"20180102T1100",
"20180102T1200",
"20180102T1300",
"20180102T1400",
"20180102T1500",
"20180102T1600",
"20180102T1700",
"20180102T1800",
"20180102T1900",
"20180102T2000",
"20180102T2100",
"20180102T2200",
"20180102T2300"
]
]
},
{
"geometry": {
"type": "MultiPoint",
"locationNames": [
"59.4°N/24.7°E31.7m",
"59.4°N/24.8°E36.4m"
],
"coordinates": [
[
24.7,
59.4,
31.73
],
[
24.8,
59.4,
36.445
]
]
},
"domain": "NEMS12",
"codes": [
{
"unit": "°C",
"dataPerTimeInterval": [
{
"data": [
[
-10.84,
-12,
-10.280001,
-8.865,
-8.5,
-7.7,
-7.5699997,
-7.655,
-8.434999,
-8.844999,
-8.700001,
-7.1549997,
-9.555,
-10.004999,
-7.885,
-8.32,
-8.370001,
-8.915,
-9.53,
-10.225,
-10.934999,
-11.12,
-11.434999,
-11.575,
-11.965,
-11.64,
-12.12,
-12.345,
-12.34,
-12.48,
-12.844999,
-13.174999,
-13.18,
-13.219999,
-13.434999,
-13.305,
-12.775,
-12.745,
-12.79,
-12.75,
-12.690001,
-12.77,
-12.77,
-12.76,
-12.67,
-12.605,
-12.635,
-12.695
],
[
-10.74,
-11.94,
-10.54,
-8.77,
-8.56,
-7.75,
-7.52,
-7.53,
-8.24,
-8.95,
-8.77,
-7.15,
-9.48,
-10.03,
-7.88,
-8.24,
-8.35,
-8.82,
-9.4,
-10.08,
-10.84,
-11.04,
-11.3,
-11.5,
-11.9,
-11.6,
-12.09,
-12.31,
-12.39,
-12.48,
-12.83,
-13.16,
-13.2,
-13.19,
-13.4,
-13.3,
-12.77,
-12.7,
-12.78,
-12.71,
-12.66,
-12.73,
-12.73,
-12.72,
-12.62,
-12.57,
-12.6,
-12.67
]
],
"gapFillRatio": 0
}
],
"code": 11,
"level": "700 mb",
"aggregation": "none",
"variable": "Temperature"
}
],
"timeResolution": "hourly",
"timeIntervals": [
[
"20180101T0000",
"20180101T0100",
"20180101T0200",
"20180101T0300",
"20180101T0400",
"20180101T0500",
"20180101T0600",
"20180101T0700",
"20180101T0800",
"20180101T0900",
"20180101T1000",
"20180101T1100",
"20180101T1200",
"20180101T1300",
"20180101T1400",
"20180101T1500",
"20180101T1600",
"20180101T1700",
"20180101T1800",
"20180101T1900",
"20180101T2000",
"20180101T2100",
"20180101T2200",
"20180101T2300",
"20180102T0000",
"20180102T0100",
"20180102T0200",
"20180102T0300",
"20180102T0400",
"20180102T0500",
"20180102T0600",
"20180102T0700",
"20180102T0800",
"20180102T0900",
"20180102T1000",
"20180102T1100",
"20180102T1200",
"20180102T1300",
"20180102T1400",
"20180102T1500",
"20180102T1600",
"20180102T1700",
"20180102T1800",
"20180102T1900",
"20180102T2000",
"20180102T2100",
"20180102T2200",
"20180102T2300"
]
]
},
{
"geometry": {
"type": "MultiPoint",
"locationNames": [
"59.4°N/24.7°E",
"59.4°N/24.8°E"
],
"coordinates": [
[
24.7,
59.4,
"NaN"
],
[
24.8,
59.4,
"NaN"
]
]
},
"domain": "CAMSGLOBAL",
"codes": [
{
"unit": "",
"dataPerTimeInterval": [
{
"data": [
[
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN"
],
[
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN",
"NaN"
]
],
"gapFillRatio": 0
}
],
"code": 706,
"level": "sfc",
"aggregation": "none",
"variable": "Air Quality Index"
}
],
"timeResolution": "hourly",
"timeIntervals": [
[
"20180101T0000",
"20180101T0100",
"20180101T0200",
"20180101T0300",
"20180101T0400",
"20180101T0500",
"20180101T0600",
"20180101T0700",
"20180101T0800",
"20180101T0900",
"20180101T1000",
"20180101T1100",
"20180101T1200",
"20180101T1300",
"20180101T1400",
"20180101T1500",
"20180101T1600",
"20180101T1700",
"20180101T1800",
"20180101T1900",
"20180101T2000",
"20180101T2100",
"20180101T2200",
"20180101T2300",
"20180102T0000",
"20180102T0100",
"20180102T0200",
"20180102T0300",
"20180102T0400",
"20180102T0500",
"20180102T0600",
"20180102T0700",
"20180102T0800",
"20180102T0900",
"20180102T1000",
"20180102T1100",
"20180102T1200",
"20180102T1300",
"20180102T1400",
"20180102T1500",
"20180102T1600",
"20180102T1700",
"20180102T1800",
"20180102T1900",
"20180102T2000",
"20180102T2100",
"20180102T2200",
"20180102T2300"
]
]
}
]
Given your first example, you could then process it like this to separate the json array into individual objects and stuff them into a table as separate rows:
create table real_json as select value::jsonb from temp_json join lateral json_array_elements(values::json) on true;
However, this depends on the large single json object always being small enough to fit comfortably into an amount of memory you are willing to use, which seems like a dubious proposition. You need a library which does incremental or streaming parsing on the JSON object, returning one 2nd level object at a time and then clearing it from memory one returned. I don't think that PostgreSQL provides such a facility. If you let us know what your favorite programming language is, perhaps someone can propose a specific library.
Alternatively, you could whip up a quick and dirty script that divides the JSON into lines for separate records based on the assumption that the indenting of the "pretty" file is always correct, and so using "^ [{}]" as markers, and then strips out the newlines to reverse the "pretty" formatting so that each record is a single line. If you had such a script, you could then do:
\copy real_json FROM PROGRAM 'unnest_top_array_and_depretty /home/cae/test_without_new_lines.json';
Same code of #jjanes with a real, working command line tool.
\copy json_table FROM PROGRAM 'jq --stream -nc -f myfile.json';
Removing the ''pretty format'' from the file helped in using the COPY function but it puts the whole content of the file in one row, making it impossible to run a simple SELECT query on an existing column ...
Here is what I used :
CREATE TEMP TABLE target(data jsonb);
copy target from '/home/cae/test_without_new_lines.json';