How to avoid Big Query nesting date values in JSON? - google-bigquery

I'm using standard SQL with BigQuery and everything works fine, except that my dates come back in a nested structure, and I have no idea why BigQuery is doing that.
Here's my query:
SELECT
  DATETIME(salesData.date_utc, "EST") AS DateEST,
  salesData.serial_no AS MachineID
FROM
  sales.sales_all AS salesData
WHERE salesData.date_utc > "2018-05-26T05:00:00"
  AND salesData.date_utc < "2018-05-27T04:59:59"
ORDER BY salesData.date_utc DESC
When I download the results as JSON, everything is fine:
{"DateEST":"2018-05-26T23:57:58","MachineID":"1708FB0000009-B"}
{"DateEST":"2018-05-26T23:52:07","MachineID":"1710FB0000034-B"}
But if I pull the data through a Google Cloud Function, it results in nested JSON:
[
  {
    "DateEST": {
      "value": "2018-05-26T23:57:58"
    },
    "MachineID": "1708FB0000009-B"
  }, ...
Here's part of my Cloud Function code:
const options = {
  query: sqlQuery,
  useLegacySql: false, // Use standard SQL syntax for queries.
};

bigquery
  .query(options)
  .then(results => {
    const rows = results[0];
    response.json(rows);
  })
  .catch(err => {
    console.error('ERROR:', err);
    response.sendStatus(500); // send a 500 status code
  });

This issue was solved by casting the DATETIME value to a string:
CAST(DATETIME(salesData.date_utc, "EST") AS STRING) AS DateEST
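An alternative is to leave the query as-is and unwrap the values in the Cloud Function instead: the Node.js BigQuery client returns DATETIME columns as wrapper objects whose value property holds the string, which is why JSON.stringify nests them. A minimal sketch, reusing the code from the question and assuming only DateEST needs unwrapping:

bigquery
  .query(options)
  .then(results => {
    const rows = results[0];
    // Each DATETIME cell is a wrapper object; keep only its string value
    const flattened = rows.map(row => ({
      DateEST: row.DateEST.value,
      MachineID: row.MachineID,
    }));
    response.json(flattened);
  })
  .catch(err => {
    console.error('ERROR:', err);
    response.sendStatus(500);
  });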

Related

How to make complex nested where conditions with typeORM?

I have multiple nested where conditions and want to generate them with TypeORM without too much code duplication.
The SQL where condition should be something like this:
WHERE "Table"."id" = $1
AND
"Table"."notAvailable" IS NULL
AND
(
"Table"."date" > $2
OR
(
"Table"."date" = $2
AND
"Table"."myId" > $3
)
)
AND
(
"Table"."created" = $2
OR
"Table"."updated" = $4
)
AND
(
"Table"."text" ilike '%search%'
OR
"Table"."name" ilike '%search%'
)
But with FindConditions it doesn't seem possible to nest them, so I would have to list every combination of the AND clauses in a FindConditions array. And it isn't possible to split it across .where() and .andWhere(), because andWhere can't take an object literal.
Is there another possibility to achieve this query with TypeORM without using raw SQL?
When using the query builder, I would recommend using Brackets, as described in the TypeORM docs: https://typeorm.io/#/select-query-builder/adding-where-expression
You could do something like:
createQueryBuilder("user")
.where("user.registered = :registered", { registered: true })
.andWhere(new Brackets(qb => {
qb.where("user.firstName = :firstName", { firstName: "Timber" })
.orWhere("user.lastName = :lastName", { lastName: "Saw" })
}))
which results in:
SELECT ...
FROM users user
WHERE user.registered = true
AND (user.firstName = 'Timber' OR user.lastName = 'Saw')
I think you are mixing two ways of retrieving entities with TypeORM: find from the repository, and the query builder. FindConditions are used by the find function; andWhere is used by the query builder. When building more complex queries it is generally better and easier to use the query builder.
Query builder
When using the query builder you get much more freedom to make the query exactly what you need it to be. In the where you are free to add any SQL you please:
const desiredEntity = await connection
  .getRepository(User)
  .createQueryBuilder("user")
  .where("user.id = :id", { id: 1 })
  .andWhere(
    "user.date > :date OR (user.date = :date AND user.myId = :myId)",
    {
      date: specificCreatedAtDate,
      myId: mysteryId,
    })
  .getOne();
Note that the raw SQL you write here has to be compatible with the database you use, which is also a possible drawback of this method: it ties your project to a specific database. Make sure to read up on the aliases you can set for tables; if you are using relations, they come in handy.
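For instance, when you join a relation you choose its alias yourself and can then use that alias in raw where fragments. A minimal sketch, reusing the names from the snippet above and assuming a hypothetical photos relation on User:

const desiredEntity = await connection
  .getRepository(User)
  .createQueryBuilder("user")
  // the second argument is the alias you can reference in raw SQL fragments
  .leftJoinAndSelect("user.photos", "photo")
  .where("user.id = :id", { id: 1 })
  .andWhere("photo.date > :date", { date: specificCreatedAtDate })
  .getOne();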
Repository
You already saw that this is much less comfortable. That's because the find function, or more specifically its findOptions, uses objects to build the where clause, which makes it harder to offer a proper interface for nesting AND and OR clauses side by side. Therefore (I assume) they chose to split the AND and OR clauses. This makes the interface much more declarative, but it means you have to pull your OR clauses up to the top:
const desiredEntity = await repository.find({
  where: [{
    id: id,
    notAvailable: IsNull(),
    date: MoreThan(date)
  }, {
    id: id,
    notAvailable: IsNull(),
    date: date,
    myId: MoreThan(myId)
  }]
})
Looking at the size of the desired query, I cannot imagine this code would be very performant.
Alternatively you could use the Raw find helper. This would require you to rewrite your clause per field, since you only get access to one alias at a time. You could guess the column names or aliases, but that would be very poor practice and very unstable, since you cannot easily control them.
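For completeness, a minimal sketch of that Raw helper on a single field; in recent TypeORM versions the callback receives the alias for that one column and the second argument binds the parameters (the surrounding values are assumed to be in scope):

import { Raw } from "typeorm";

const desiredEntities = await repository.find({
  where: {
    id: id,
    // Raw exposes only this column's alias, not the rest of the table
    date: Raw(alias => `${alias} > :date`, { date: specificCreatedAtDate }),
  },
});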
If you want to nest andWhere statements when a condition is met, here is an example:
async getTasks(filterDto: GetTasksFilterDto, user: User): Promise<Task[]> {
  const { status, search } = filterDto;

  // Create a query using the query builder; 'task' refers to the Task entity
  const query = this.createQueryBuilder('task');

  // Only get the tasks that belong to the user
  query.where('task.userId = :userId', { userId: user.id });

  // If status is defined, add a where clause to the query.
  // :<variable-name> is a placeholder filled from the second argument.
  if (status) {
    query.andWhere('task.status = :status', { status });
  }

  // If search is defined, add a where clause to the query.
  // LIKE finds a similar match (doesn't have to be exact): https://www.w3schools.com/sql/sql_like.asp
  // LOWER is a SQL function: https://www.w3schools.com/sql/func_sqlserver_lower.asp
  // The wrapping parentheses matter: andWhere stitches clauses together with AND,
  // and without them the OR branch would bypass the userId filter above.
  if (search) {
    query.andWhere(
      '(LOWER(task.title) LIKE LOWER(:search) OR LOWER(task.description) LIKE LOWER(:search))',
      // :search is the parameter name; the object supplies its value
      { search: `%${search}%` },
    );
  }

  // Execute the query; getMany returns an array of results
  let tasks;
  try {
    tasks = await query.getMany();
  } catch (error) {
    this.logger.error(
      `Failed to get tasks for user "${user.username}", Filters: ${JSON.stringify(filterDto)}`,
      error.stack,
    );
    throw new InternalServerErrorException();
  }
  return tasks;
}
I have a list of
{
  date: specificCreatedAtDate,
  userId: mysteryId
}
My solution is
.andWhere(
  new Brackets((qb) => {
    qb.where(
      'userTable.date = :date0 AND userTable.userId = :userId0',
      {
        date0: dates[0].date,
        userId0: dates[0].userId,
      }
    );
    for (let i = 1; i < dates.length; i++) {
      qb.orWhere(
        `userTable.date = :date${i} AND userTable.userId = :userId${i}`,
        {
          [`date${i}`]: dates[i].date,
          [`userId${i}`]: dates[i].userId,
        }
      );
    }
  })
)
That produces something similar to:
const userEntity = await repository.find({
  where: [{
    userId: userId0,
    date: date0
  }, {
    userId: userId1,
    date: date1
  },
  ....
  ]
})

Parsing JSON in Snowflake

I'm trying to parse the nested JSON below in Snowflake using the LATERAL FLATTEN function, but I want each nested entry in "GoalTime" to show up as a column. For example:
GoalTime_InDoorOpen        2020-03-26T12:58:00-04:00
GoalTime_InLastOff         null
GoalTime_OutStartBoarding  2020-03-27T14:00:00-04:00
"GoalTime": [
{
"GoalName": "GoalTime_InDoorOpen",
"GoalTime": "2020-03-26T12:58:00-04:00"
},
{
"GoalName": "GoalTime_InLastOff"
},
{
"GoalName": "GoalTime_InReadyToTow"
},
{
"GoalName": "GoalTime_OutTowAtGate"
},
{
"GoalName": "GoalTime_OutStartBoarding",
"GoalTime": "2020-03-27T14:00:00-04:00"
},
Or, if you have many rows (what appear to be flights) and thus need the goal columns per flight, this code may be what you are after:
with data as (
  select flight_code, parse_json(json) as json
  from values
    ('nz101','{"GoalTime": [{"GoalName": "GoalA", "GoalTime": "2020-03-26T12:58:00-04:00"}, {"GoalName": "GoalB"}]}'),
    ('nz201','{"GoalTime": [{"GoalName": "GoalA"}, {"GoalName": "GoalB", "GoalTime": "2020-03-26T12:58:00-02:00"}]}')
    j(flight_code, json)
), unrolled as (
  select d.flight_code, f.value:GoalName as goal_name, f.value:GoalTime as goal_time
  from data d,
    lateral flatten (input => json:GoalTime) f
)
select *
from unrolled
  pivot(min(goal_time) for goal_name in ('GoalA', 'GoalB'))
order by flight_code;
it gives the results:
FLIGHT_CODE  'GoalA'                      'GoalB'
nz101        "2020-03-26T12:58:00-04:00"  null
nz201        null                         "2020-03-26T12:58:00-02:00"
create or replace function JSON_STRING()
returns string
language javascript
as
$$
return `
[
  {
    "GoalName": "GoalTime_InDoorOpen",
    "GoalTime": "2020-03-26T12:58:00-04:00"
  },
  {
    "GoalName": "GoalTime_InLastOff"
  },
  {
    "GoalName": "GoalTime_InReadyToTow"
  },
  {
    "GoalName": "GoalTime_OutTowAtGate"
  },
  {
    "GoalName": "GoalTime_OutStartBoarding",
    "GoalTime": "2020-03-27T14:00:00-04:00"
  }
]
`;
$$;
select value:GoalName::string as GoalName, value:GoalTime::timestamp as GoalTime
from lateral flatten(input => parse_json(JSON_STRING()));
-- See how the lateral flatten combination works on a JSON variant:
select * from lateral flatten(input => parse_json(JSON_STRING()));
I wrote this to run in any Snowflake worksheet, no tables needed. The function on top simply allows the JSON to be written as a multi-line string in the SQL statement below it. It has no other use than representing a string holding your JSON.
Step 1 is to PARSE_JSON, which converts a string into a variant data type formatted as a JSON object.
Step 2 is the lateral flatten. If you do a select star on that, it will return a number of columns. One of them is "value".
Step 3 is to extract the properties you want using single : notation for the property name and dots to traverse down the nodes from there (if there are any).
Step 4 is to cast the property to the data type you want using double :: notation. This is especially important if you're doing comparisons on the column particularly in join keys.
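As a tiny self-contained illustration of steps 3 and 4 together (the Inner and Leaf property names are invented for the example):

with j as (
  select parse_json('{"Inner": {"Leaf": "2020-03-26T12:58:00-04:00"}}') as v
)
select v:Inner.Leaf::timestamp as leaf_time from j;

The single colon selects the top-level Inner property, the dot traverses down to Leaf, and the double colon casts the resulting variant to a timestamp.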
Note that there was a slightly invalid part of the JSON that prevented it from parsing: at the top level, the array had a property name ("GoalTime":), which is not valid JSON on its own. I removed that to allow parsing.
Probably close to what you seek is using a standard SQL UNION statement.
Given the following are true to recreate the solution:
- You created a table JSON_GOALS with one column for the raw JSON, called GOALS_RAW
- You loaded the JSON data into that table as raw JSON, with compliant JSON object array syntax and a parent key, GoalTimeGroup, ex: {[{}]}, so:
{
  "GoalTimeGroup": [{
      "GoalName": "GoalTime_InDoorOpen",
      "GoalTime": "2020-03-26T12:58:00-04:00"
    },
    {
      "GoalName": "GoalTime_InLastOff"
    },
    {
      "GoalName": "GoalTime_InReadyToTow"
    },
    {
      "GoalName": "GoalTime_OutTowAtGate"
    },
    {
      "GoalName": "GoalTime_OutStartBoarding",
      "GoalTime": "2020-03-27T14:00:00-04:00"
    }
  ]
}
Doing so allows you to write a fairly standard JSON retrieve in Snowflake with the following syntax:
SELECT GOALS_RAW:GoalTimeGroup[0].GoalName, GOALS_RAW:GoalTimeGroup[1].GoalName, GOALS_RAW:GoalTimeGroup[2].GoalName
FROM JSON_GOALS
UNION
SELECT GOALS_RAW:GoalTimeGroup[0].GoalTime, GOALS_RAW:GoalTimeGroup[1].GoalTime, GOALS_RAW:GoalTimeGroup[2].GoalTime
FROM JSON_GOALS
;
This gets you closer to the answer you are looking for and seems to provide a simpler solution. You can also control how many rows you want based on the JSON object attributes for each GOAL object.
A recommendation to enhance this would be to create a function that detects the depth of each nested element and auto-generates the indexes for 'n' number of columns.
The library below provides a method called executeAll, one of whose params is tags: if you provide an array of tags and values, all of them are parsed and validated, while keeping Snowflake's SQL injection protection.
snowflake-multisql
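A rough sketch of how that might look; the option names here (sqlText, tags entries with tag and value, includeResults) are assumptions based on the library's README, so check them against the current docs:

const { Snowflake } = require("snowflake-multisql");

const snowflake = new Snowflake({ account: "...", username: "...", password: "..." });
await snowflake.connect();

// each {%goal%} tag in the script is bound and validated before execution
const results = await snowflake.executeAll({
  sqlText: "select * from JSON_GOALS where GOALS_RAW:GoalTimeGroup[0].GoalName = {%goal%};",
  tags: [{ tag: "goal", value: "GoalTime_InDoorOpen" }],
  includeResults: true,
});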

How to use query parameter as col name on sequelize query

So, I have an Express server running the Sequelize ORM. The client answers some questions with radio button options, and the answers are passed as query parameters in the URL so I can make a GET request to the server.
The thing is: my req.query values are supposed to be the column names for my database table. I want to know if it's possible to get the response from the database using Sequelize while passing those parameters as the column names of the table.
async indexAbrigo(req, res) {
  try {
    let type = req.query.type     // type = 'periodoTEMPORARIO'
    let reason = req.query.reason // reason = 'motivoRUA'
    let abrigos = await Abrigos.findAll({
      attributes: ['idABRIGOS'],
      where: {
        // I want type and reason to be the parameters that I got from the client
        type: 'S',
        reason: 'S'
      }
    })
    res.send(abrigos)
  } catch (err) {
    res.status(500).send(err.message)
  }
}
This is not working; the resulting query is something like:
(SELECT `idABRIGOS` FROM Abrigos WHERE `type` = `S` AND `reason` = `S`)
Instead, I need type and reason to be translated to their values, and those values used as the column names in the SQL. Is that possible with Sequelize? Or with any ORM for Node/Express?
Thanks.
Yes, you can do it with computed property names. It would look like this:
where: {
  [type]: 'S',
  [reason]: 'S'
}
Your node.js version must be compatible with ES6.
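Since those computed keys come straight from the client, it's worth checking them against a fixed list of real column names before they reach the query. A minimal sketch, with a hypothetical allow-list:

const ALLOWED_COLUMNS = ['periodoTEMPORARIO', 'motivoRUA']

const { type, reason } = req.query
// reject anything that is not a known column name
if (!ALLOWED_COLUMNS.includes(type) || !ALLOWED_COLUMNS.includes(reason)) {
  return res.status(400).send('Invalid query parameter')
}

const abrigos = await Abrigos.findAll({
  attributes: ['idABRIGOS'],
  where: {
    [type]: 'S',
    [reason]: 'S'
  }
})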
Thank you - I am passing parameters straight from req.body and this worked perfectly for me, so I am posting it in case it saves someone time.
app.post('/doFind', (req, res) => {
  Abrigos.update(
    { [req.body.field]: req.body.newVal },
    { returning: true, where: { id: req.body.id } }
  )
    .then(() => {
      res.status(200).end()
    })
    .catch(err => {
      console.log("Err: " + err)
      res.status(404).send('Error attempting to update database').end()
    })
})

Search Query Not Handling Large Number of Users

We have an ASP.NET Core website which handles user search as follows:
public async Task<ICollection<UserSearchResult>> SearchForUser(string name, int page)
{
    return await db.ApplicationUsers
        .Where(u => u.Name.Contains(name) && !u.Deleted && u.AppearInSearch)
        .OrderByDescending(u => u.Verified)
        .Skip(page * recordsInPage)
        .Take(recordsInPage)
        .Select(u => new UserSearchResult()
        {
            Name = u.Name,
            Verified = u.Verified,
            PhotoURL = u.PhotoURL,
            UserID = u.Id,
            Subdomain = u.Subdomain
        }).ToListAsync();
}
The query translates to something similar to the following:
SELECT [t].[Name], [t].[Verified], [t].[PhotoURL], [t].[Id], [t].[Subdomain]
FROM (
    SELECT [u0].*
    FROM [AspNetUsers] AS [u0]
    WHERE (((CHARINDEX('khaled', [u0].[Name]) > 0) OR ('khaled' = N'')) AND ([u0].[Deleted] = 0)) AND ([u0].[AppearInSearch] = 1)
    ORDER BY [u0].[Verified] DESC
    OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY
) AS [t]
On the client side we use typeahead and Bloodhound as follows:
engine = new Bloodhound({
  identify: function (user) {
    return user.UserID;
  },
  queryTokenizer: Bloodhound.tokenizers.whitespace,
  datumTokenizer: Bloodhound.tokenizers.obj.whitespace('name'),
  dupDetector: function (a, b) { return a.UserID === b.UserID; },
  remote: {
    cache: false,
    url: '/account/Search?name=%QUERY&page=0',
    wildcard: '%QUERY'
  }
});
and we configure typeahead as follows:
$('#demo-input').typeahead(
  {
    hint: $('.Typeahead-hint'),
    menu: $('.Typeahead-menu'),
    minLength: 3,
    classNames: {
      open: 'is-open',
      empty: 'is-empty',
      cursor: 'is-active',
      suggestion: 'Typeahead-suggestion',
      selectable: 'Typeahead-selectable'
    }
  },
  {
    source: engineWithDefaults,
    displayKey: 'name',
    templates: {
      suggestion: template,
      empty: empty,
      footer: all
    },
    limit: 5
  })
The search works just fine on localhost, and the query runs great as a plain SQL query.
I have also created an index on Verified, which cut the time to 1 second or less.
Our website has millions of registered users, and the problem is that as soon as we make search available to all of them, the DTU percentage on Azure goes to 100% and the queries time out.
We also have a Redis cache to speed up similar queries, but it didn't help with this issue.
Your support is appreciated :)
It's quite likely to be u.Name.Contains(name), i.e. CHARINDEX('khaled', [u0].[Name]) > 0, which has to scan the entire table or, at best, the index. That will be slow, and there's not much you can do about it.
If a large share of your rows are deleted or hidden from search, you might be able to use a conditional (filtered) index, but these kinds of substring searches are notoriously slow and you will need some special constructs to make this work; a sketch follows.
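For reference, a filtered index along these lines (SQL Server syntax, matching the generated query above; the index name and INCLUDE list are assumptions):

CREATE NONCLUSTERED INDEX IX_AspNetUsers_Search
ON AspNetUsers ([Verified] DESC, [Name])
INCLUDE ([PhotoURL], [Subdomain])
WHERE [Deleted] = 0 AND [AppearInSearch] = 1;

This keeps deleted and hidden users out of the index entirely, so the CHARINDEX scan touches fewer rows, but it still scans every remaining row rather than seeking.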

Getting SQL MIN() and MAX() from DBIx::Class::ResultSetColumn in one query

I want to select the MIN() and MAX() of a column from a table. But instead of querying the database twice I'd like to solve this in just one query.
I know I could do this
my $col = $schema->resultset("Birthday")->get_column("birthdate");
my $min = $col->min();
my $max = $col->max();
But it would query the database twice.
The only other solution I found is quite ugly: messing around with the select and as attributes of search(). For example:
my $res = $rs->search({}, {
    select => [ { min => "birthdate" }, { max => "birthdate" } ],
    as     => [qw/minBirthdate maxBirthdate/],
});
say $res->get_column("minBirthdate")->first() . " - " . $res->get_column("maxBirthdate")->first();
Which produces the SQL I want:
SELECT MIN(birthdate), MAX(birthdate) FROM birthdays;
Is there any more elegant way to get this done with DBIx::Class?
And to make it even cooler, is there a way to respect the inflation/deflation of the column?
You can use columns as a shortcut to combine select and as attributes as such:
my $res = $rs->search(undef, {
    columns => [
        { minBirthdate => { min => "birthdate" } },
        { maxBirthdate => { max => "birthdate" } },
    ]
});
Or, if you prefer more control over the SQL, use string refs, which can help with more complex calculations:
my $res = $rs->search(undef, {
    columns => [
        { minBirthdate => \"MIN(birthdate)" },
        { maxBirthdate => \"MAX(birthdate)" },
    ]
});
Now, if you really want to clean it up a bit, I highly recommend DBIx::Class::Helpers, which allows you to write it as such:
my $minmax = $rs->columns([
    { minBirthdate => \"MIN(birthdate)" },
    { maxBirthdate => \"MAX(birthdate)" },
])->hri->single;

say "$minmax->{minBirthdate} - $minmax->{maxBirthdate}";