ERROR: only one relationship type is allowed for MERGE open cypher query agensgraph - cypher

Is the relationship creation below allowed in an openCypher query? I am trying this in AgensGraph:
MATCH (mc: mat_comp)
MATCH (p:plant)
MATCH (mb: material)
WHERE mc.component = mb.material and mc.plant=p.b_plant
MERGE (mc) <- [ comp_2_p] - ( p)
;
ERROR: only one relationship type is allowed for MERGE
What mistake am I making? The mat_comp and plant nodes have plant in common, and the mat_comp and material nodes have material in common.
mat_comp: has a material column and a plant column
p: plant
mb: material column
MATCH (mc: mat_comp)
MATCH (p:plant)
MATCH (mb: material)
WHERE mc.comp = mb.material and mc.plant=p.b_plant
RETURN mc.comp, mb.material, mc.plant, p.b_plant;
 comp | material | plant | b_plant
------+----------+-------+---------
 "10" | "10"     | "33"  | "33"
=# \d material
 material | character varying(50) |
 b_plant  | character varying(50) |
=# \d mat_comp
 material | character varying(50) |
 comp     | character varying(50) |
 plant    | character varying(50) |
=# \d plant
 b_plant  | character varying(50) |

I think your query has a simple mistake in the MERGE clause.
To cut to the point, you can use this query:
MERGE (mc) <- [:comp_2_p] - (p)
The mistake is the missing colon.
The word after the colon is the label of the element (a vlabel for a node, an elabel for an edge).
The word before the colon is a variable, a name you can use to refer to that element elsewhere in the Cypher query.
If you still get an error with the query above, check whether the elabel "comp_2_p" has been created (in AgensGraph, with CREATE ELABEL).
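Putting it together with the column names from the \d output (a sketch only; note that the working RETURN query used mc.comp rather than mc.component), the corrected statement would look roughly like this:
MATCH (mc:mat_comp)
MATCH (p:plant)
MATCH (mb:material)
WHERE mc.comp = mb.material AND mc.plant = p.b_plant
MERGE (mc)<-[:comp_2_p]-(p);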

Related

Error in condition in where clause in timescale db while visualising in grafana

I am trying to visualise data from TimescaleDB in Grafana with the following query:
SELECT $__timeGroup(timestamp,'30m'), sum(error) as Error
FROM userCounts
WHERE serviceid IN ($Service) AND ciclusterid IN ($CiClusterId)
AND environment IN ($environment) AND filterid IN ($filterId)
AND $__timeFilter("timestamp")
GROUP BY timestamp;
However, it gives an error and no data shows when I add the filterid IN ($filterId) part.
I have checked the variable names a thousand times but I'm not sure what the error is. Logically, if the filters for the other variables work in their conditions, this one should work too. Not sure what is going wrong. Can anyone give input?
Edit:
The schema looks like this:
timestamp   | timestamp without time zone | not null
measurement | character varying(150)      |
filterid    | character varying(150)      |
environment | character varying(150)      |
iscanary    | boolean                     |
servicename | character varying(150)      |
serviceid   | character varying(150)      |
ciclusterid | character varying(150)      |
... (more columns)
In Grafana it gives the error:
pq: column "in_orgs_that_have_had_an_operational_connector" does not exist
This happens when filterId = IN_ORGS_THAT_HAVE_HAD_AN_OPERATIONAL_CONNECTOR is selected. It is a value, not a column, so I'm not sure why the error refers to it as a column; it also appears in lower case in the error while the value is in upper case.
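The lower-casing is a clue, though this is an assumption rather than something confirmed in the thread: PostgreSQL folds unquoted identifiers to lower case, so if $filterId is interpolated without quotes, its value is parsed as a column name instead of a string literal. One way to test that theory is to force quoted interpolation with Grafana's variable formatting options (exact syntax depends on the Grafana version); everything below except the filterid line is unchanged from the query above:
-- Sketch: ${filterId:singlequote} interpolates the selected values as
-- single-quoted string literals rather than bare identifiers (assumption).
SELECT $__timeGroup(timestamp,'30m'), sum(error) as Error
FROM userCounts
WHERE serviceid IN ($Service) AND ciclusterid IN ($CiClusterId)
  AND environment IN ($environment) AND filterid IN (${filterId:singlequote})
  AND $__timeFilter("timestamp")
GROUP BY timestamp;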

Oracle SQL regex extraction

I have data as follows in a column
+----------------------+
| my_column |
+----------------------+
| test_PC_xyz_blah |
| test_PC_pqrs_bloh |
| test_Mobile_pqrs_bleh|
+----------------------+
How can I extract the following as columns?
+----------+-------+
| Platform | Value |
+----------+-------+
| PC | xyz |
| PC | pqrs |
| Mobile | pqrs |
+----------+-------+
I tried using REGEXP_SUBSTR
Default first pattern occurrence for platform:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)') as platform from table
Getting second pattern occurrence for value:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)', 1, 2) as value from table
This isn't working, however. Where am I going wrong?
For Non-empty tokens
select regexp_substr(my_column,'[^_]+',1,2) as platform
,regexp_substr(my_column,'[^_]+',1,3) as value
from my_table
;
For possibly empty tokens
select regexp_substr(my_column,'^.*?_(.*)?_.*?_.*$',1,1,'',1) as platform
,regexp_substr(my_column,'^.*?_.*?_(.*)?_.*$',1,1,'',1) as value
from my_table
;
+----------+-------+
| PLATFORM | VALUE |
+----------+-------+
| PC       | xyz   |
| PC       | pqrs  |
| Mobile   | pqrs  |
+----------+-------+
(.*) is greedy by nature: it matches any characters, including the _ character, so test_(.*) matches the whole of your string. The later groups in the pattern _(.*)_(.*) then have nothing left to match and the whole regex fails. The trick is to match all characters except _. This can be done with the group ([^_]+), which defines a negated character set matching any character except _. If you have a more specific pattern, you can use something like [A-Za-z] or [[:alnum:]] instead. Once you have sliced your string into substrings separated by _, just select the 2nd and 3rd occurrences.
ex:
SELECT REGEXP_SUBSTR( my_column,'(([^_]+))',1,2) as platform, REGEXP_SUBSTR( my_column,'(([^_]+))',1,3) as value from my_table;
Note: AFAIK there is no straightforward way in Oracle to extract matching groups directly. You can use regexp_replace for this purpose, but it lacks the capabilities of other programming languages where you can extract just group 2 and group 3. See this link for an example.
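For completeness, here is a self-contained way to try the non-empty-token approach against the sample values (a sketch; the inline my_table data is hypothetical, built from the question's sample column):
-- Hypothetical demo data built inline from the question's sample column.
WITH my_table AS (
  SELECT 'test_PC_xyz_blah'      AS my_column FROM dual UNION ALL
  SELECT 'test_PC_pqrs_bloh'     AS my_column FROM dual UNION ALL
  SELECT 'test_Mobile_pqrs_bleh' AS my_column FROM dual
)
SELECT regexp_substr(my_column, '[^_]+', 1, 2) AS platform,
       regexp_substr(my_column, '[^_]+', 1, 3) AS value
FROM my_table;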

Remove invalid data based on particular pattern SQL Server

I have sample data as shown below
------------------------------------------------
| ID | Column 1 | Column 2 |
------------------------------------------------
| 1 | 0229-10010 |Valid |
------------------------------------------------
| 2 | 20483 |InValid |
------------------------------------------------
| 3 | 319574R06-STAT |Valid |
------------------------------------------------
| 4 | ,,,,,,,,,,,,,,1,,,,,,, |InValid |
------------------------------------------------
| 5 | "PBOM-SSE, CHAMBER" |Valid |
------------------------------------------------
| 6 | ""PBOM-SSE, CHAMBER |InValid |
------------------------------------------------
| 7 | "PBOM-SSE CHAMBER", |InValid |
------------------------------------------------
| 8 | #DRM-1102.Z |InValid |
------------------------------------------------
| 9 | DRM#1102.Z |Valid |
------------------------------------------------
| 10 |OEM-2-202 4079 KALREZ |Valid |
------------------------------------------------
| 11 |-OEM2202 4079 KALREZ# |InValid |
------------------------------------------------
What I want to do is build a pattern that fetches only the invalid data. The Valid/Invalid values are just for illustration; my table does not have any such flag.
The trick here is that the same wildcard characters mean different things in different positions. Consider records ID 5 and ID 6: the wildcard characters are the same in both, but their position decides whether the value is valid or not. And the position rules are not entirely clear-cut either; I hope you can see from the samples why each value in Column 1 is valid or invalid. In record 8, the '#' before the item doesn't make sense, whereas a '#' after letters does make sense (record 9).
In record 2 there are a lot of blank spaces before the number, which is why it is invalid, but that doesn't mean a space by itself is a wildcard.
I have written a query like the one below.
SELECT [PartNumber]
FROM [IBSSSystems].[dbo].[Part]
WHERE (PartNumber LIKE '%[?;.,$^#&*{}:"<>/|\ %'']%'
OR PartNumber LIKE '%[%'
OR PartNumber LIKE '%]%')
The above query fetches a record whenever it sees any wildcard character in it. But I need a query that understands the rules above and fetches only the invalid data. I guess there will be a lot of AND and OR in the resulting query, but I'm confused. I hope you can help me out. Thanks in advance.
SELECT [PartNumber]
FROM [IBSSSystems].[dbo].[Part]
WHERE (PartNumber LIKE '[^A-Za-z0-9"]%' ESCAPE '\'                 -- invalid when the first character is a special character (" is an exception)
    OR PartNumber LIKE '%[^A-Za-z0-9" ]' ESCAPE '\'                -- invalid when the last character is a special character (" is an exception; trailing spaces are also allowed)
    OR PartNumber LIKE '%[^A-Za-z0-9 ][^A-Za-z0-9 ]%'              -- invalid when there are two or more consecutive special characters
    OR PartNumber LIKE '%[\^\[\]\\_?;$#&*{}:<>/|''~`]%' ESCAPE '\' -- add characters here that are not allowed anywhere in the string
)
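To see which of the sample rows these predicates actually flag, a throwaway test like this can help (a sketch; the #Part temp table and its contents are hypothetical, copied from a subset of the sample data in the question):
-- Hypothetical test data mirroring some of the question's sample rows.
CREATE TABLE #Part (ID int, PartNumber varchar(100));
INSERT INTO #Part (ID, PartNumber) VALUES
  (1, '0229-10010'),
  (2, '             20483'),
  (4, ',,,,,,,,,,,,,,1,,,,,,,'),
  (5, '"PBOM-SSE, CHAMBER"'),
  (6, '""PBOM-SSE, CHAMBER'),
  (8, '#DRM-1102.Z'),
  (9, 'DRM#1102.Z');

-- Rows returned here are the ones the current rules classify as invalid;
-- adjust the character lists until the output matches your expectations.
SELECT ID, PartNumber
FROM #Part
WHERE (PartNumber LIKE '[^A-Za-z0-9"]%' ESCAPE '\'
    OR PartNumber LIKE '%[^A-Za-z0-9" ]' ESCAPE '\'
    OR PartNumber LIKE '%[^A-Za-z0-9 ][^A-Za-z0-9 ]%'
    OR PartNumber LIKE '%[\^\[\]\\_?;$#&*{}:<>/|''~`]%' ESCAPE '\');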

PostGIS ST_Distance between zip codes on PostgreSQL 9.x

this is more of a SQL question than a PostGIS question, but I'm getting stuck again :(
I have a table called referred with id numbers in the "from" and "to" columns.
I want to calculate the distance between ALL these id numbers based on their zip code.
There is a separate reference table called doc, which contains the id number in the "NPI" column and the zip code in the "Provider Business Mailing Address Postal Code" column, and a separate geo table called zctas, which has a zip code column zcta and a geom column.
For example, this query works fine:
SELECT z.zcta As zip1,
z2.zcta As zip2,
ST_Distance(z.geom,z2.geom) As thedistance
FROM zctas z,
zctas z2
WHERE z2.zcta = '60611'
AND z.zcta='19611';
One catch is that "Provider Business Mailing Address Postal Code" should be compared as left("Provider Business Mailing Address Postal Code", 5), i.e. only the first five characters of the postal code should be used.
I'm getting stuck on JOIN-ing the 2 zip codes from the reference table in this one query.
Sample table:
referred table:
from | to | count
------------+------------+-------
1174589766 | 1538109665 | 108
1285653204 | 1982604013 | 31
desired output:
from | to | count | distance
------------+------------+-------+----------
1174589766 | 1538109665 | 108 | 53434
1285653204 | 1982604013 | 31 | 34234
\d+
Table "public.zctas"
Column | Type | Modifiers | Storage | Stats target | Description
------------------+------------------------+-----------+----------+--------------+-------------
state | character(2) | | extended | |
zcta | character(5) | | extended | |
junk | character varying(100) | | extended | |
population_tot | bigint | | plain | |
housing_tot | bigint | | plain | |
water_area_meter | double precision | | plain | |
land_area_meter | double precision | | plain | |
water_area_mile | double precision | | plain | |
land_area_mile | double precision | | plain | |
latitude | double precision | | plain | |
longitude | double precision | | plain | |
thepoint_lonlat | geometry(Point,4269) | | main | |
thepoint_meter | geometry(Point,32661) | not null | main | |
geom | geometry(Point,32661) | | main | |
Indexes:
"idx_zctas_thepoint_lonlat" gist (thepoint_lonlat)
"idx_zctas_thepoint_meter" gist (thepoint_meter) CLUSTER
Table "public.referred"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-----------------------+-----------+----------+--------------+-------------
from | character varying(25) | | extended | |
to | character varying(25) | | extended | |
count | integer | | plain | |
Has OIDs: no
Table "public.doc"
Column | Type | Modifiers | Storage | Stats target | Description
--------------------------------------------------------------+------------------------+-----------+----------+--------------+-------------
NPI | character varying(255) | | extended | |
Entity Type Code | character varying(255) | | extended | |
Replacement NPI | character varying(255) | | extended | |
Employer Identification Number (EIN) | character varying(255) | | extended | |
Provider Organization Name (Legal Business Name) | character varying(255) | | extended | |
Provider Last Name (Legal Name) | character varying(255) | | extended | |
Provider First Name | character varying(255) | | extended | |
Provider Middle Name | character varying(255) | | extended | |
Provider Name Prefix Text | character varying(255) | | extended | |
Provider Name Suffix Text | character varying(255) | | extended | |
Provider Credential Text | character varying(255) | | extended | |
Provider Other Organization Name | character varying(255) | | extended | |
Provider Other Organization Name Type Code | character varying(255) | | extended | |
Provider Other Last Name | character varying(255) | | extended | |
Provider Other First Name | character varying(255) | | extended | |
Provider Other Middle Name | character varying(255) | | extended | |
Provider Other Name Prefix Text | character varying(255) | | extended | |
Provider Other Name Suffix Text | character varying(255) | | extended | |
Provider Other Credential Text | character varying(255) | | extended | |
Provider Other Last Name Type Code | character varying(255) | | extended | |
Provider First Line Business Mailing Address | character varying(255) | | extended | |
Provider Second Line Business Mailing Address | character varying(255) | | extended | |
Provider Business Mailing Address City Name | character varying(255) | | extended | |
Provider Business Mailing Address State Name | character varying(255) | | extended | |
Provider Business Mailing Address Postal Code | character varying(255) | | extended | . . . . other columns not really needed.
Thanks!!!!
This should be relatively straightforward.
Assuming the NPIs are actually all the same length in doc and referred, you can join those tables quite easily:
SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
bd."Provider Business Mailing Address Postal Code" as b_zip,
r."count"
FROM referred r
LEFT JOIN doc ad ON r."from" = ad."NPI"
LEFT JOIN doc bd ON r."to" = bd."NPI";
Obviously, adjust this join based on careful analysis of the NPI and from/to fields in your data. Add trim or left method calls within the join if necessary -- the most important thing is that the JOIN condition be on comparable data.
Now, going from this to your original query to find a distance is trivial:
SELECT ad."Provider Business Mailing Address Postal Code" as a_zip,
bd."Provider Business Mailing Address Postal Code" as b_zip,
r."count",
ST_Distance(az.geom,bz.geom) As thedistance
FROM referred r
LEFT JOIN doc ad ON r."from" = ad."NPI"
LEFT JOIN doc bd ON r."to" = bd."NPI"
LEFT JOIN zctas az
ON az.zcta = left(ad."Provider Business Mailing Address Postal Code",5)
LEFT JOIN zctas bz
ON bz.zcta = left(bd."Provider Business Mailing Address Postal Code",5)
This is just one construction that should work; many others are possible. This particular construction will ensure that every entry in referred is represented, even if the NPI doesn't match an entry in the doc table, or a zip code can't be matched against the zctas table.
On the flip side, if there exists more than one entry for an NPI in the doc table, any referred entry that mentions this duplicated NPI will also be duplicated.
Similarly, if there is more than one entry in zctas for a particular zip code (zcta), you would see duplicates of referred rows.
That's how LEFT JOIN works, but I figured it was worth putting in the warning, as Provider data is typically full of duplicates against NPI, and there are often duplicate zip codes in zip code lookup lists as some zip codes cross state lines.
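If duplicates do show up, one way to guard against them (a sketch, not part of the original answer; it assumes any one of a provider's duplicate postal codes is acceptable) is to deduplicate doc on NPI first, for example with PostgreSQL's DISTINCT ON, and join against that subquery instead of doc directly:
-- One row per NPI, keeping an arbitrary postal code for each provider.
SELECT DISTINCT ON ("NPI")
       "NPI",
       left("Provider Business Mailing Address Postal Code", 5) AS zip5
FROM doc
ORDER BY "NPI";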

COUNT and GROUP BY on text fields seems slow

I'm building a MySQL database which contains entries about special substrings of DNA in species of yeast. My table looks like this:
+--------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------+---------+------+-----+---------+-------+
| species | text | YES | MUL | NULL | |
| region | text | YES | MUL | NULL | |
| gene | text | YES | MUL | NULL | |
| startPos | int(11) | YES | | NULL | |
| repeatLength | int(11) | YES | | NULL | |
| coreLength | int(11) | YES | | NULL | |
| sequence | text | YES | MUL | NULL | |
+--------------+---------+------+-----+---------+-------+
There are approximately 1.8 million records. In one type of query I want to see how many DNA substrings are associated with each type of species and region, so I issue this query:
select species, region, count(*) from mytablename group by species, region;
The species and region columns have only two possible entries (conserved/scer for species, and promoter/coding for region) yet this query takes about 30 seconds.
Is this a normal amount of time to expect for this type of query given the size of the table? Is it slow because I'm using text fields instead of simple integer or boolean values (I prefer text fields as several non-CS researchers will be using the DB). Any other ideas and suggestions would be welcome.
Please excuse if this is a boneheaded question, I am an SQL neophyte.
P.S. I've also seen this question but the proposed solution doesn't seem relevant for what I'm doing.
EDIT: Converting those fields to VARCHARs reduced the runtime to ~2.5 seconds. Note I also timed it against ENUMs which had a similar timing.
Why are all your string-based columns defined as TEXT? If you read this performance comparison, you'll see that TEXT was ~3x slower than a VARCHAR column using identical indexing: http://forums.mysql.com/read.php?24,105964,105964
If your fields are only ever going to have 2 values, you're much better off making them booleans. You should also make everything NOT NULL unless there's a real reason you'll need it to be NULL.
Also take a look at the ENUM type for a better way to use a finite number of human-readable values for a column.
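For example, a conversion along these lines (a sketch only; the table name mytablename matches the index example below, and the ENUM values are taken from the question's description):
-- Assumed table name; ENUM values come from the question (conserved/scer,
-- promoter/coding). Assumes no NULLs remain before adding NOT NULL.
ALTER TABLE mytablename
    MODIFY species ENUM('conserved', 'scer') NOT NULL,
    MODIFY region  ENUM('promoter', 'coding') NOT NULL;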
As for slowness, the first thing to try is to create indices on your columns. For the particular query you're showing here, a composite index on (species, region) should make a huge difference:
CREATE INDEX idx_species_region ON mytablename (species, region);
should do it.
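One caveat not mentioned above: the statement works as written once the columns are VARCHAR or ENUM, but while they are still TEXT, MySQL requires a prefix length on each indexed column, along these lines (the 20-character prefixes are an assumption; size them to the longest expected value):
-- Prefix index for TEXT columns; prefix lengths are assumptions.
CREATE INDEX idx_species_region ON mytablename (species(20), region(20));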