This is how PyYAML behaves on my machine:
>>> plan = {'Business Plan': ['Collect Underpants', '?', 'Profit']}
>>> print(yaml.dump(plan))
Business Plan: [Collect Underpants, '?', Profit]
What I want instead is this output (both are valid YAML):
Business Plan:
- Collect Underpants
- '?'
- Profit
Is there some kind of option that would do it?
You need to add the 'default_flow_style=False' argument to the call:
In [6]: print(yaml.dump(plan, default_flow_style=False))
Business Plan:
- Collect Underpants
- '?'
- Profit
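For reference, here is a self-contained sketch of the same call (safe_dump accepts the same argument and is a common substitute for dump):
import yaml

plan = {'Business Plan': ['Collect Underpants', '?', 'Profit']}
# default_flow_style=False forces block style for nested collections
print(yaml.safe_dump(plan, default_flow_style=False))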
I need to strip the characters after the third '-', after the first '(', and after the first '/', and keep the result in a new column, keepcat.
           violation_code        violation_description                                  keepcat
ticket_id
22056      9-1-36(a)             Failure of owner to obtain certificate of compliance  9-1-36
27586      61-63.0600            Failed To Secure Permit For Lawful Use Of Building    61-63.0600
18738      61-63.0500            Failed To Secure Permit For Lawful Use Of Land        61-63.0500
18735      61-63.0100            Noncompliance/Grant Condition/BZA/BSE                 61-63.0100
23812      61-81.0100/32.0066    Open Storage/ Residential/ Vehicles                   61-81.0100/32.0066
26686      61-130.0000/130.0300  Banner/ Signage/ Antenna                              61-130.0000/130.0300
325555     9-1-43(a) - (Structu  Fail to comply with an Emergency                      9-1-43
I have managed to delete the dashes ("-") and the brackets ("(") with this:
df['keepcat']=df['violation_code'].apply(lambda x: "-".join(x.split("-")[:3]) and x.split('(')[0].strip())
however, when I add the "/" split, it does not delete the slashes...
I have tried
df['violation_code'].apply(lambda x: "-".join(x.split("-")[:3]) and x.split('(')[0].strip()) and x.split('/')[0].strip() )
Thank you.
Does it work if you apply the splits separately?
df['keepcat'] = df['violation_code'].apply(lambda x: "-".join(x.split("-")[:3]))
df['keepcat'] = df['keepcat'].apply(lambda x: x.split('(')[0].strip())
df['keepcat'] = df['keepcat'].apply(lambda x: x.split('/')[0].strip())
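If you want a single pass instead, here is a compact sketch (assuming the same rules as above: cut at the first '(' or '/', then keep at most the first three '-'-separated parts):
import re

# Cut at the first '(' or '/', then keep at most three '-'-separated parts
df['keepcat'] = df['violation_code'].apply(
    lambda x: "-".join(re.split(r'[(/]', x)[0].strip().split('-')[:3])
)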
I have some results in one of my tables and the results vary; each ';' represents multiple entries in one column, which I need to split out.
Here is my SQL and the results:
select REGEXP_COUNT(value,';') as cnt,
description
from mytable;
1  {Managed By|xBoss}{xBoss xBoss Number|X0910505569}{Time Requested|2009-04-15 20:47:11.0}{Time Arrived|2009-04-15 21:46:11.0};
1  {Managed By|Modern Management}{xBoss Number|}{Time Requested|2009-04-16 14:01:29.0}{Time Arrived|2009-04-16 14:44:11.0};
2  {Managed By|xBoss}{xBoss Number|X091480092}{Time Requested|2009-05-28 08:58:41.0}{Time Arrived|};{Managed By|Jims Allocation}{xBoss xBoss Number|}{Time Requested|}{Time Arrived|};
Desired output:
R1:
Managed By: xBoss
Time Requested:2009-10-19 07:53:45.0
Time Arrived: 2009-10-19 07:54:46.0
R2:
Managed By:Own Arrangements
Number: x5876523
Time Requested: 2009-10-19 07:57:46.0
Time Arrived:
R3:
Managed By: xBoss
Time Requested:2009-10-19 08:07:27.0
select
SPLIT_PART(description, '}', 1),
SPLIT_PART(description, '}', 2),
SPLIT_PART(description, '}', 3),
SPLIT_PART(description, '}', 4),
SPLIT_PART(description, '}', 5)
as description_with_tag from mytable;
This is OK when the count is 1, but when there are multiple ';'-separated entries in the description it doesn't give me the results.
Is it possible to put this into an array based on the count?
First, it's worth pointing out that data in this type of format cannot take advantage of all the benefits that Redshift can offer. Amazon Redshift is a columnar database that can provide amazing performance when data is stored in appropriate columns. However, selecting specific text from a text field will always perform poorly.
Therefore, my main advice would be to pre-process the data into normal rows and columns so that Redshift can provide you the best capabilities.
However, to answer your question, I would recommend making a Scalar User-Defined Function:
CREATE FUNCTION f_extract_curly (s TEXT, key TEXT)
RETURNS TEXT
STABLE
AS $$
# Drop the trailing semicolon and the outer braces,
# leaving a '}{'-delimited list of items in {brackets}
items = s.rstrip(';')[1:-1].split('}{')
# Dictionary of Key|Value from items
entries = {i.split('|')[0]: i.split('|')[1] for i in items}
# Return desired value
return entries.get(key, None)
$$ LANGUAGE plpythonu;
I loaded sample data with:
CREATE TABLE foo (
description TEXT
);
INSERT INTO foo values('{Managed By|xBoss}{xBoss xBoss Number|X0910505569}{Time Requested|2009-04-15 20:47:11.0}{Time Arrived|2009-04-15 21:46:11.0};');
INSERT INTO foo values('{Managed By|Modern Management}{xBoss Number|}{Time Requested|2009-04-16 14:01:29.0}{Time Arrived|2009-04-16 14:44:11.0};');
INSERT INTO foo values('{Managed By|xBoss}{xBoss Number|X091480092}{Time Requested|2009-05-28 08:58:41.0}{Time Arrived|};{Managed By|Jims Allocation}{xBoss xBoss Number|}{Time Requested|}{Time Arrived|};');
Then I tested it with:
SELECT
f_extract_curly(description, 'Managed By'),
f_extract_curly(description, 'Time Requested')
FROM foo
and got the result:
xBoss              2009-04-15 20:47:11.0
Modern Management  2009-04-16 14:01:29.0
xBoss
It doesn't know how to handle lines that have the same field specified twice (with semi-colons between). You did not provide enough sample input and output lines for me to figure out what you wanted in such situations, but feel free to tweak the code for your requirements.
There is no array data type in Redshift. There are three options:
1) First split_part by ';', then union the results separately for every index of the first split_part output, then split_part those results by '}', and finally extract what you need.
2) Create a Python UDF and process these strings with Python. I guess this is the best solution for your use case.
3) Transform your data outside Redshift. From your data structure it seems much better to process it before copying to Redshift, unnesting the arrays into rows and extracting keys from your objects into columns (see the sketch below).
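A minimal pre-processing sketch in Python (parse_description is a hypothetical helper; it assumes the raw strings look like the sample data above):
import re

def parse_description(description):
    # Each ';'-separated record becomes one row,
    # and each {Key|Value} pair becomes one column
    rows = []
    for record in filter(None, description.split(';')):
        pairs = re.findall(r'\{([^|{}]*)\|([^{}]*)\}', record)
        rows.append(dict(pairs))
    return rows

sample = ('{Managed By|xBoss}{xBoss Number|X091480092}'
          '{Time Requested|2009-05-28 08:58:41.0}{Time Arrived|};'
          '{Managed By|Jims Allocation}{xBoss xBoss Number|}'
          '{Time Requested|}{Time Arrived|};')
for row in parse_description(sample):
    print(row)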
I have a DQL query object which we've implemented by copying a legacy application. The output has to match the legacy output verbatim.
The static fields worked like a charm - but now we've encountered more complex computed fields, such as:
IF(
wo.date_approved = 0,
0,
IF(
ship.date_shipped > wo.date_approved,
ROUND(
IF(
(ship.date_shipped - wo.date_approved) > wo.time_inactive,
(ship.date_shipped - wo.date_approved) - wo.time_inactive,
ship.date_shipped - wo.date_approved
) / 86400, 2),
0
)
) AS TAT,
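For readers untangling the nested IFs, here is a plain restatement of the logic as a Python sketch (field names are taken from the SQL above; tat is a hypothetical helper):
def tat(date_approved, date_shipped, time_inactive):
    # Turnaround time in days: shipped minus approved,
    # excluding inactive time when it fits inside the interval
    if date_approved == 0:
        return 0
    if date_shipped <= date_approved:
        return 0
    elapsed = date_shipped - date_approved
    if elapsed > time_inactive:
        elapsed -= time_inactive
    return round(elapsed / 86400.0, 2)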
This is not possible to express using the query builder/DQL. I had hoped to adjust the query right before execution (after parameters have been bound, but before the query runs).
Using a placeholder or similar, I would search for it and replace it with the series of computed fields...
I can't figure out a way to make this happen.
Alex
I have an SQL Server ALM project. I need a query that extracts all requirements with their full path, which can be up to four levels deep, including the requirement name.
Since I'm on SQL Server, I'm unable to use the tools supplied by Oracle (which would make this quite easy). The query must run from within ALM.
All I have so far is this:
SELECT distinct RQ_REQ_ID AS "Int Req ID",
REQ.RQ_REQ_NAME AS "Req ID",
REQ.RQ_REQ_REVIEWED as "Req Status",
REQ.RQ_REQ_STATUS AS "Req Coverage"
FROM REQ
WHERE RQ_TYPE_ID != 1
ORDER BY RQ_REQ_NAME
Can anyone please complete the statement so it would contain the full requirement path?
Thanks
Okay, I'm at work, it's the holidays, and I've finished up my projects. I've seen this done as an Oracle query, where there's some simplicity to it, but I've worked in SQL Server for a bit. This is a total hack query and it's written in Oracle syntax, but the concept is the same.
SELECT distinct RQ_REQ_ID AS "Int Req ID",
r.RQ_REQ_NAME AS "Req ID",
r.RQ_REQ_REVIEWED as "Req Status",
r.RQ_REQ_STATUS AS "Req Coverage",
r.rq_req_path,
(select rq_req_name from req where rq_req_path = substr(r.rq_req_path, 1, 3))
|| '/' ||
(select rq_req_name from req where rq_req_path = substr(r.rq_req_path, 1, 6))
|| '/' ||
(select rq_req_name from req where rq_req_path = substr(r.rq_req_path, 1, 9))
FROM REQ r
WHERE RQ_TYPE_ID != 1
ORDER BY RQ_REQ_NAME
You'll want + instead of || and you may have to do some cast/convert operations on the return values. The reason I thought of doing it this way is that crawling rq_req_path and iterating down is kind of... frowned upon in SQL Server.
What I did was find the max length of RQ_REQ_PATH and just add n concatenations. Afterwards, it is easy to strip out the extra '/'.
I'm positive there's a better, cleaner way of doing it, but in case anyone else is looking at this, it should be a starting point, and if it's a one-off report it works fine.
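To make the prefix trick concrete, here is a small Python sketch of the idea, assuming RQ_REQ_PATH is a materialized path built from fixed 3-character segments (the names mapping is hypothetical sample data):
# Map of RQ_REQ_PATH prefix -> RQ_REQ_NAME (hypothetical sample values)
names = {
    'AAA': 'Functional',
    'AAAAAB': 'Login',
    'AAAAABAAC': 'Validate password',
}

def full_path(path, seg=3):
    # Walk the fixed-width prefixes of the materialized path,
    # collecting the requirement name stored at each level
    prefixes = (path[:i] for i in range(seg, len(path) + 1, seg))
    return '/'.join(names[p] for p in prefixes if p in names)

print(full_path('AAAAABAAC'))  # Functional/Login/Validate password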
I'm trying to do a small query to convert a value back and forth.
Doing this query here:
Select (120 * POWER(CAST(2 AS BIGINT), 32) + 87) As Test
will give me the result of:
515396075607
Now, I want a query that converts "515396075607" back, and the result should give me 120.
It's like going back and forth, like inch -> cm / cm -> inch.
Any thoughts?
The inverse of the function is:
select exp(log(2) * (log(515396075607 - 87)/log(2) - 32))
This would more easily be expressed as:
select exp2(log2(515396075607 - 87) - 32)
However, not all versions of SQL Server allow you to provide the base for logs and exponents.
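Since the forward calculation is just x = 120 * 2^32 + 87, the inverse is plain integer arithmetic; a quick Python check of both directions:
x = 120 * 2**32 + 87
print(x)                  # 515396075607
print((x - 87) // 2**32)  # 120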