Get the difference between data in Swift and data in a database - sql

A table in the database has two columns: ID and Value.
In my project there is other data in a dictionary whose keys are IDs and whose values are Values.
I want to get the difference: the data that is in the dictionary but not in the database. If both data sets were in the database, I could use the SQL commands EXCEPT or NOT EXISTS to get the difference.
What is the best way to do this?
I use SQLiteDB, where the result of a query is an array of dictionaries like this:
[["ID":"id1", "Value": "val1"], ["ID":"id2", "Value": "val2"],...]
Also note that both columns should be considered when comparing these two data sets (the dictionary and the data in the DB).
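
For illustration, here is the EXCEPT idea mentioned above as a minimal sketch, using Python's sqlite3 rather than the Swift SQLiteDB wrapper; the table name main_table and the sample data are assumptions:

import sqlite3

# Minimal sketch of the EXCEPT approach, assuming the in-memory data can be
# loaded into a temporary table first. All names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_table (ID TEXT, Value TEXT)")
conn.execute("INSERT INTO main_table VALUES ('id1', 'val1'), ('id2', 'val2')")

in_memory = {"id1": "val1", "id3": "val3"}  # ID -> Value, as in the question

# Load the dictionary into a temp table so EXCEPT can compare both columns.
conn.execute("CREATE TEMP TABLE incoming (ID TEXT, Value TEXT)")
conn.executemany("INSERT INTO incoming VALUES (?, ?)", in_memory.items())

# Rows present in the dictionary but missing from the database.
diff = conn.execute(
    "SELECT ID, Value FROM incoming EXCEPT SELECT ID, Value FROM main_table"
).fetchall()
print(diff)  # [('id3', 'val3')]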

// first get the intersection of the new data and the data we already have
let intersection = dataSet1.filter { data in dataSet2.contains(where: { $0.id == data.id }) }
// then keep the elements of dataSet1 that are NOT in the intersection
let dataSet1MinusDataSet2 = dataSet1.filter { data in !intersection.contains(where: { data.id == $0.id }) }
I've written this code without Xcode, so syntax errors are possible, but I think you will get the idea.
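
Since the question asks that both columns participate in the comparison, here is the same difference computed over whole (ID, Value) pairs, as a minimal Python sketch with made-up sample data:

db_rows = [{"ID": "id1", "Value": "val1"}, {"ID": "id2", "Value": "val2"}]
incoming = {"id1": "val1", "id3": "val3"}  # ID -> Value

# Compare whole (ID, Value) pairs, not just the IDs.
db_pairs = {(row["ID"], row["Value"]) for row in db_rows}
missing = set(incoming.items()) - db_pairs
print(missing)  # {('id3', 'val3')}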

Related

Filter based on 2 JSON properties after using json_extract

Imported database table:
id           | JSON
-------------|---------
Signed 32int | Raw JSON
It is easier to search via the properties of the JSON data than by the id of the row itself. Each piece of JSON data contains (for this demo):
json: {
    displayProperties: {},
    hash: "foo",
    itemType: "bar"
}
When I select, I would like to match on hash, and then filter those results by a matching itemType.
My query:
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash' AND json_tree.value IN ${hashList}
However, this returns every item that has a matching hash value. From here, I would like to also filter by key itemType and value 19. So I tried:
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash' AND json_tree.value IN ${hashList}
AND WHERE json_tree.key = 'itemType' AND json_tree.value = 19
But this isn't syntactically correct, let alone output what I am looking for. Error:
SQLITE_ERROR: near "WHERE": syntax error
The title of the question turned out not to be accurate to what I was looking for. I misunderstood what json_tree actually does: json_tree builds a new object with values that are filled in by the database.
What I was actually looking for was to filter by a specific value in the json column, which can be achieved with json_extract: json_extract({column}, '$.{filterValue}') pulls the raw JSON value out of the json column.
This is the query that is working for me now:
SELECT json_extract(ItemDefinition.json, '$')
FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
WHERE json_tree.key = 'hash'
AND json_tree.value IN ${hashList}
AND json_extract(ItemDefinition.json, '$.itemType') = 19
This selects the json column from ItemDefinition,
creates a json_tree from the json column,
filters results by json_tree key and value,
and finally filters by the itemType property from the raw json column.
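
As a hedged sketch, here is the same query run from Python's sqlite3 (assuming a SQLite build with the JSON1 functions), with the ${hashList} template variable turned into proper placeholders; the sample row is made up:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ItemDefinition (id INTEGER, json TEXT)")
conn.execute("""INSERT INTO ItemDefinition
                VALUES (1, '{"hash": "foo", "itemType": 19}')""")

hash_list = ["foo", "baz"]
placeholders = ",".join("?" for _ in hash_list)  # one ? per hash
query = f"""
    SELECT json_extract(ItemDefinition.json, '$')
    FROM ItemDefinition, json_tree(ItemDefinition.json, '$')
    WHERE json_tree.key = 'hash'
      AND json_tree.value IN ({placeholders})
      AND json_extract(ItemDefinition.json, '$.itemType') = 19
"""
print(conn.execute(query, hash_list).fetchall())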

Increment operator (++) in Entity Framework Core causes an error

When using a SQL select I need a counter for my sequence of rows, like this:
var result = from d in data
             select new[]
             {
                 Convert.ToString(count++)
             };
but this syntax gives the following error message:
An expression tree may not contain an assignment operator Vehicle.app..NETCoreApp,Version=v1.0
When using LINQ (e.g. from x in data select new { ... }) the expression in the select is translated into a SQL statement and executed in the database. How would an assignment like count++ work in a database? I think you're better off selecting the records from the database and applying the sequence number to the in-memory list of data rather than to the queryable collection that represents the database. For example:
var list = (from d in data select new { ... }).ToList();
var result = list.Select(d => new { id = Convert.ToString(count++), ... }).ToList();
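
The same materialize-then-number idea, sketched in Python for illustration (the sample rows are made up):

# Pull the rows into memory first; enumerate then plays the role of the
# count++ counter that cannot run inside the database.
rows = [{"plate": "ABC-1"}, {"plate": "XYZ-9"}]  # stands in for the query result

result = [{"id": str(i), **row} for i, row in enumerate(rows, start=1)]
print(result)  # [{'id': '1', 'plate': 'ABC-1'}, {'id': '2', 'plate': 'XYZ-9'}]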

How can I store the results of a SQL query as a hash with unique keys?

I have a query that returns multiple rows:
select id,status from store where last_entry = <given_date>;
The returned rows look like:
id      status
-----------------
1131A   correct
1132B   incorrect
1134G   empty
I want to store the results like this:
$rows = [
    {
        ID1     => '1131A',
        status1 => 'correct'
    },
    {
        ID2     => '1132B',
        status2 => 'incorrect'
    },
    {
        ID3     => '1134G',
        status3 => 'empty'
    }
];
How can I do this?
What you are looking for is a hash of hashes in Perl. What you do is:
Iterate over the results of your query.
Split each entry by tab.
Create a hash with the id as key and status as value.
Now, to store the hash created by each such query, you create another hash. Here the key could be something like 'given_date' in your case, so you could write:
$parent_hash{given_date} = \%child_hash;
This results in the parent hash holding a reference to each query result.
For more you can refer to these resources:
http://perldoc.perl.org/perlref.html
http://www.thegeekstuff.com/2010/06/perl-array-reference-examples/
Have a look at the DBI documentation.
Here is part of a script that does what you want:
my $rows;
while (my $hash_ref = $sth->fetchrow_hashref) {
    push @$rows, $hash_ref;
}
You can do this by passing a Slice option to DBI's selectall_arrayref:
my $results = $dbh->selectall_arrayref(
    'select id,status from store where last_entry = ?',
    { Slice => {} },
    $last_entry,
);
This will return an array reference with each row stored in a hash. Note that since hash keys must be unique, you will run into problems if you have duplicate column names in your query.
This is the kind of question that raises an immediate red flag. It's somewhat of an odd request to want a collection (array/array reference) of data structures that are heterogeneous; the point of a collection is that its elements share a shape. If you tell us what you intend to do with the data rather than what you want the data to look like, we can probably suggest a better solution.
You want something like this:
# select the data as an array of hashes - returned as an arrayref
my $rows = $dbh->selectall_arrayref($the_query, { Slice => {} }, @any_search_params);

# now make the id keys unique
my $i = 1;
foreach my $row (@$rows) {
    # remove each column and assign the value to a uniquely named column
    # by adding a numeric suffix
    $row->{"ID$i"}     = delete $row->{ID};
    $row->{"status$i"} = delete $row->{status};
    $i += 1;
}
Add your own error checking.
You said "store as a hash," but your example is an array of hashes; a true hash of hashes would need a slightly different method.
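
For comparison, here is a minimal sketch of both shapes in Python (illustration only; the rows are the sample data from the question):

rows = [("1131A", "correct"), ("1132B", "incorrect"), ("1134G", "empty")]

# The array-of-hashes shape from the question, with numbered key names.
as_list = [
    {f"ID{i}": row_id, f"status{i}": status}
    for i, (row_id, status) in enumerate(rows, start=1)
]

# A true hash of hashes is simpler: key by the already-unique id.
as_hash = {row_id: {"status": status} for row_id, status in rows}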

How to stream repeated fields into bigquery?

I'm using tabledata().insertAll().
Here is some test data I'm trying to insert:
row = {
    'insertId': str(i * o),
    'json': {
        'meterId': i * o,
        'erfno': str(i),
        'latitude': '123123',
        'longitude': '123123',
        'address': str(random.randint(1, 100)) + 'foobar street',
        'readings': [
            {
                'read_at': time.time(),
                'usage': random.randrange(50, 500),
                'account': 'acc' + str(i * o)
            }
        ]
    }
}
It gives me the error:
array specified for non-repeated field
I want to stream one record at a time into the 'readings' repeated field (and thus append to it) every minute.
You cannot update an existing row. You cannot add to an existing row. You need to rethink this. Don't forget that BigQuery is append-only.
You can have repeated fields in rows, but they must be declared as repeated in your schema.
In your situation, you need to create new rows for every reading. A reading can be a record if you want to structure your data like that.
Correct! You should consider flattening your table, inserting a new row for every new reading.
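
A hedged sketch of the flattened approach with the google-cloud-bigquery client (the table id and the flattened field names are assumptions):

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.readings"  # hypothetical table

# One new row per reading, instead of appending to a repeated field.
row = {
    "meterId": 42,
    "erfno": "7",
    "address": "12 foobar street",
    "read_at": 1700000000.0,
    "usage": 123,
    "account": "acc42",
}
errors = client.insert_rows_json(table_id, [row])
if errors:
    print("insert failed:", errors)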

Best way to compare two hash of hashes?

Right now I have two hash of hashes, 1 that I created by parsing a log file, and 1 that I grab from SQL. I need to compare them to find out if the record from the log file exists in the database already. Right now I am iterating through each element to compare them:
foreach my $i (@record)
{
    foreach my $a (@{$data})
    {
        if ($i->{port} eq $a->{port} and $i->{name} eq $a->{name})
        {
            print "match found $i->{name}, updating record in table\n";
        }
        else
        {
            print "no match found for $tableDate $i->{port} $i->{owner} $i->{name} adding record to table\n";
            executeStatement("INSERT INTO client_usage (date, port, owner, name, emailed) VALUES ('$tableDate', '$i->{port}', '$i->{owner}', '$i->{name}', '0')");
        }
    }
}
Naturally, this takes a long time to run through as the database gets bigger. Is there a more efficient way of doing this? Can I compare the keys directly?
You have more than a hash of hashes. You have two lists, and each element in each list is a hash. Thus, you have to compare each item in one list with each item in the other list. Your algorithm's efficiency is O(n*m): not because it's a hash of hashes, but because you're comparing each row in one list with each row in another list.
Is it possible to go through your lists and turn them into a hash that is keyed by the port and name? That way, you go through each list once to create the indexing hash, then go through the hash once to do the comparison.
For example, to create the hash from the record:
my %record_hash;
foreach my $record_item (@record) {
    my $name = $record_item->{name};
    my $port = $record_item->{port};
    $record_hash{"$name:$port"} = $record_item;    # or something like this...
}
Next, you'd do the same for your data list:
my %data_hash;
foreach my $data_item (@{$data}) {
    my $name = $data_item->{name};
    my $port = $data_item->{port};
    $data_hash{"$name:$port"} = $data_item;    # or something like this...
}
Now you can go through your newly created hash just once:
foreach my $key (keys %record_hash) {
    my $i = $record_hash{$key};    # the original record, for the messages below
    if (exists $data_hash{$key}) {
        print "match found $i->{name}, updating record in table\n";
    }
    else {
        print "no match found for $tableDate $i->{port} $i->{owner} $i->{name} adding record to table\n";
        executeStatement("INSERT INTO client_usage (date, port, owner, name, emailed) VALUES ('$tableDate', '$i->{port}', '$i->{owner}', '$i->{name}', '0')");
    }
}
Let's say you have 1000 elements in one list, and 500 elements in the other. Your original algorithm would have to loop 500 * 1000 times (half a million times). By creating an index hash, you have to loop through 2(500 + 1000) times (about 3000 times).
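
The same indexing idea in a Python sketch (illustration only, with made-up sample data): build a set keyed by the fields that define a match, then test membership in constant time per record.

records = [{"port": "8080", "name": "alice"}, {"port": "9090", "name": "bob"}]
db_rows = [{"port": "8080", "name": "alice"}]

# One pass to build the index, one pass to compare.
db_keys = {(r["port"], r["name"]) for r in db_rows}
for rec in records:
    if (rec["port"], rec["name"]) in db_keys:
        print(f"match found {rec['name']}, updating record in table")
    else:
        print(f"no match found for {rec['port']} {rec['name']}, adding record to table")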
Another possibility: since you're already using a SQL database, why not do the whole thing as a SQL query? That is, don't fetch the records. Instead, go through your data, and for each data item, fetch the matching record. If the record exists, you update it. If not, you create a new one. That may be even faster because you're not turning the whole thing into a list in order to turn it into a hash.
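
A sketch of that database-side approach in Python's sqlite3 (illustration only; the table follows the INSERT in the question, and the sample data is made up):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE client_usage
                (date TEXT, port TEXT, owner TEXT, name TEXT, emailed INTEGER)""")

table_date = "2024-01-01"  # stands in for $tableDate
records = [{"port": "8080", "owner": "ops", "name": "alice"}]

for rec in records:
    # Ask the database whether the record exists; update on a hit, insert on a miss.
    hit = conn.execute(
        "SELECT 1 FROM client_usage WHERE port = ? AND name = ?",
        (rec["port"], rec["name"]),
    ).fetchone()
    if hit:
        conn.execute(
            "UPDATE client_usage SET date = ? WHERE port = ? AND name = ?",
            (table_date, rec["port"], rec["name"]),
        )
    else:
        conn.execute(
            "INSERT INTO client_usage (date, port, owner, name, emailed) "
            "VALUES (?, ?, ?, ?, 0)",
            (table_date, rec["port"], rec["owner"], rec["name"]),
        )
conn.commit()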
There's a way to tie SQL databases directly to hashes. That might be a good way to go too.
Are you using Perl-DBI?
How about using Data::Difference:
use Data::Difference qw(data_diff);

my @diff = data_diff(\%hash_a, \%hash_b);

# @diff = (
#     { 'a' => 'value', 'path' => [ 'data' ] },   # exists in 'a' but not in 'b'
#     { 'b' => 'value', 'path' => [ 'data' ] },   # exists in 'b' but not in 'a'
# );