Cypher query: How to compute cosine similarity in Agensgraph, and SAP HANA - cypher

I wanna test simple graph analysis performance among GraphDBes using cypher.
I referred this site and reproduce the example in Neo4j, Agensgraph, SAP HANA, and Redis.
but the cypher query(see below) is not operate in Agensgraph, and SAP HANA.
MATCH (p1:Person {name:'Michael Sherman'})-[r1:RATED]->(m:Movie)<-[r2:RATED]-(p2:Person {name:'Michael Hunger'}) RETURN m.name AS Movie, r1.rating AS `M. Sherman's Rating`, r2.rating AS `M. Hunger's Rating`
I think the second arrow pattern doesn't works in Agensgraph, and SAP HANA.
How can I edit this query to operate in Agensgraph, and SAP HANA?

I already take similar problem that caused by difference of query language grammar.
Here is my query to figure out similarity in AgesnGraph. You'll be able to compute cosine similarity with few modification.
MATCH (u1:users {userid:'Toby'})-[r1:hasrated]->(m:movies)<-[r2:hasrated]-(u2:users)
WITH distinct u1.userid as u1name, u2.userid as u2name,
SUM(r1.rating::float*r2.rating::float) as xyDotProduct,
SQRT(SUM(r1.rating::float^2)) as xLength,
SQRT(SUM(r2.rating::float^2)) as yLength
RETURN u1name, u2name, xyDotProduct::float/(xLength::float*yLength::float) as similarity;

Related

JOIN with a dataset

I am very new to Superset and SQL in general, please excuse my poor language as well.
General question: How do I use an existing superset dataset in a sql query?
Case: I am trying to create a map based on german postal codes. Therefor I need to join that table with a translation table containing german postal code to JSON coordinates. The translation table is in another database than the german postal codes are. I am constantly trying to JOIN these both together, but it does not work. I assume you can only work with the data from one single database at once. Is it possible to create datasets with the needed data and reuse these datasets in a sql query? I tried this, but I dont know how to access these. When using data on a database I would write:
Select * from database.table
To access a superset dataset in my query:
Select * from dataset (how it is named in the superset dataset list)
which does not work at all.
I am desperatly trying to solve this problem but I am just not able to.
Thanks for your help in advance.
In Superset's SQL Lab, you can run pretty much any valid SQL query that your database accepts. The query will more / less be sent to your database and the results displayed to you in the results panel. So you can run JOIN queries in SQL Lab, for example.
If you want to visualize data from the results of a SQL Query, hit the "Explore" button after running the query. Then, you'll be asked to publish the query you wrote & ran as a Virtual Dataset. Finally, you'll be taken to the Explore, no-code chart builder to visualize your data.
I wrote a bit more about the semantic layer in Superset here, if you'd like to learn more: https://preset.io/blog/understanding-superset-semantic-layer/

Apply dynamic filter operator in SAP HANA

We are using HANA 1.0 SPS12. Is it possible to handle filter condition dynamically in HANA graphical calculation view.
IS it possible to program this 'AND'/'OR' by parameter.
I was able to handle the and/or using the following, my example was about searching for year and quarter or for a year and all quarters:
(match("YEAR",'$$year$$') or in("YEAR",'$$year$$')) and (match("QUARTER",'$$quarter$$') or in ("QUARTER",'$$quarter$$'))
Perhaps this could work for you:
(match("CITY",'$$IP_CITY_FILTER$$') or in("CITY",'$$IP_CITY_FILTER$$')) and (match("AMT",'5000') or in ("AMT",'5000'))

Learning ExecuteSQL in FMP12, a few questions

I have joined a new job where I am required to use FileMaker (and gradually transition systems to other databases). I have been a DB Admin of a MS SQL Server database for ~2 years, and I am very well versed in PL/SQL and T-SQL. I am trying to pan my SQL knowledge to FMP using the ExecuteSQL functionaloty, and I'm kinda running into a lot of small pains :)
I have 2 tables: Movies and Genres. The relevant columns are:
Movies(MovieId, MovieName, GenreId, Rating)
Genres(GenreId, GenreName)
I'm trying to find the movie with the highest rating in each genre. The SQL query for this would be:
SELECT M.MovieName
FROM Movies M INNER JOIN Genres G ON M.GenreId=G.GenreId
WHERE M.Rating=
(
SELECT MAX(Rating) FROM Movies WHERE GenreId = M.GenreId
)
I translated this as best as I could to an ExecuteSQL query:
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
I set the field type to Text and also ensured values are not stored. But all I see are '?' marks.
What am I doing incorrectly here? I'm sorry if it's something really stupid, but I'm new to FMP and any suggestions would be appreciated.
Thank you!
--
Ram
UPDATE: Solution and the thought process it took to get there:
Thanks to everyone that helped me solve the problem. You guys made me realize that traditional SQL thought process does not exactly pan to FMP, and when I probed around, what I realized is that to best use SQL knowledge in FMP, I should be considering each column independently and not think of the entire result set when I write a query. This would mean that for my current functionality, the JOIN is no longer necessary. The JOIN was to bring in the GenreName, which is a different column that FMP automatically maps. I just needed to remove the JOIN, and it works perfectly.
TL;DR: The thought process context should be the current column, not the entire expected result set.
Once again, thank you #MissJack, #Chuck (how did you even get that username?), #pft221 and #michael.hor257k
I've found that FileMaker is very particular in its formatting of queries using the ExecuteSQL function. In many cases, standard SQL syntax will work fine, but in some cases you have to make some slight (but important) tweaks.
I can see two things here that might be causing the problem...
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON
M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
You can't use the standard FMP table::field format inside the query.
Within the quotes inside the ExecuteSQL function, you should follow the SQL format of table.column. So M::MovieName should be M.MovieName.
I don't see an AS anywhere in your code.
In order to create an alias, you must state it explicitly. For example, in your FROM, it should be Movies AS M.
I think if you fix those two things, it should probably work. However, I've had some trouble with JOINs myself, as my primary experience is with FMP, and I'm only just now becoming more familiar with SQL syntax.
Because it's incredibly hard to debug SQL in FMP, the best advice I can give you here is to start small. Begin with a very basic query, and once you're sure that's working, gradually add more complicated elements one at a time until you encounter the dreaded ?.
There's a number of great posts on FileMaker Hacks all about ExecuteSQL:
Since you're already familiar with SQL, I'd start with this one: The Missing FM 12 ExecuteSQL Reference. There's a link to a PDF of the entire article if you scroll down to the bottom of the post.
I was going to recommend a few more specific articles (like the series on Robust Coding, or Dynamic Parameters), but since I'm new here and I can't include more than 2 links, just go to FileMaker Hacks and search for "ExecuteSQL". You'll find a number of useful posts.
NB If you're using FMP Advanced, the Data Viewer is a great tool for testing SQL. But beware: complex queries on large databases can sometimes send it into fits and freeze the program.
The first thing to keep in mind when working with FileMaker and ExecuteSQL() is the difference between tables and table occurrences. This is a concept that's somewhat unique to FileMaker. Succinctly, tables store the data, but table occurrences define the context of that data. Table occurrences are what you're seeing in FileMaker's relationship graph, and the ExecuteSQL() function needs to reference the table occurrences in its query.
I agree with MissJack regarding the need to start small in building the SQL statement and use the Data Viewer in FileMaker Pro Advanced, but there's one more recommendation I can offer, which is to use SeedCode's SQL Explorer. It does require the adding of table occurrences and fields to duplicate the naming in your existing solution, but this is pretty easy to do and the file they offer includes a wizard for building the SQL query.

SQL queries to their natural language description

Are there any open source tools that can generate a natural language description of a given SQL query? If not, some general pointers would be appreciated.
I don't know much about NLP, so I am not sure how difficult this is, although I saw from some previous discussion that the vice versa conversion is still an active area of research. It might help to say that the SQL tables I will be handling are not arbitrary in any sense, yet mine, which means that I know exact semantics of each table and its columns.
I can devise two approaches:
SQL was intended to be "legible" to non-technical people. A naïve and simpler way would be to perform a series of replacements right on the SQL query: "SELECT" -> "display"; "X=Y" -> "when the field X equals to value Y"... in this approach, using functions may be problematic.
Use a SQL parser and use a series of templates to realize the parsed structure in a textual form: "(SELECT (SUM(X)) (FROM (Y)))" -> "(display (the summation of (X)) (in the table (Y))"...
ANTLR has a grammar of SQL you can use: https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4 and there are a couple SQL parsers:
http://www.sqlparser.com/sql-parser-java.php
https://github.com/facebook/presto/tree/master/presto-parser/src/main
http://db.apache.org/derby/
Parsing is a core process for executing a SQL query, check this for more information: https://decipherinfosys.wordpress.com/2007/04/19/parsing-of-sql-statements/
There is a new project (I am part of) called JustQuery.Me which intends to do just that with NLP and google's SyntaxNet. You can go to the https://github.com/justquery-me/justqueryme page for more info. Also, sign up for the mailing list at justqueryme-development#googlegroups.com and we will notify you when we have a proof of concept ready.

ebean in play 2 framework with SQL dialect

I'm trying to establish friendship between Teradata 13.10 and play 2 framework using ebean ORM layer. My app does try to query DB:
select t0.workflow_id c0, t0.CHNL_TYPE_CD c1, t0.WORKFLOW_NAME c2, t0.INFO_SYSTEM_TYPE_CD c3, t0.FOLDER_NAME c4 from ETL_WORKFLOW t0 order by name limit 11
The problem is... that Teradata does know nothing about LIMIT Is there any possibility to find implementation/override something and make underlying ORM work with Teradata?
UPD:
Seems like I have to do something with tese classes:
http://www.avaje.org/static/javadoc/pub/index.html
I'm looking for samples:
1. Set proper SQL dialect for ebean or make it work in SQL ANSI mode.
2. Override classes for ebean and write own implementation of LIMIT functionality.
Teradata supports both the TOP n operator and SAMPLE clause in your SELECT statement. TOP n being an extension to the ANSI SQL 2008 standard is the closest equivalent to the LIMIT n operator you are looking for.
TOP n is processed after all the other clauses in your SELECT statement have been satisfied. It is the functional replacement of using the QUALIFY ROW_NUMBER() or QUALIFY RANK() window aggregates to accomplish the same task providing at worst equivalent performance of the window aggregate functions.
SAMPLE provides you some additional flexibility by allowing you to returned multiple sample sets within a single result. It can also be used for simple, random samples from a resultset as well. Given the options that are available with SAMPLE you are best served with referring to the SQL Data Manipulation Language manual for Teradata for all of the details. The Teradata manuals can be downloaded from here. Just select which version of Teradata you wish to download the manuals for.
Edit:
Using the RawSQL feature with Ebean you may be able use either the SAMPLE or TOP n operators in your SQL explicitly and not allow Ebean to add expressions such as the LIMIT OFFSET clause automatically. Have you tried this approach yet?