Find the owner of an EC2 instance with Athena and CloudTrail
To find the owner of each EC2 instance, I query the CloudTrail logs stored in S3 with Athena.
I have a table in Athena with the following structure:
CREATE EXTERNAL TABLE cloudtrail_logs (
    eventversion STRING,
    useridentity STRUCT<
        type:STRING,
        principalid:STRING,
        arn:STRING,
        accountid:STRING,
        invokedby:STRING,
        accesskeyid:STRING,
        userName:STRING,
        sessioncontext:STRUCT<
            attributes:STRUCT<
                mfaauthenticated:STRING,
                creationdate:STRING>,
            sessionissuer:STRUCT<
                type:STRING,
                principalId:STRING,
                arn:STRING,
                accountId:STRING,
                userName:STRING>>>,
    eventtime STRING,
    eventsource STRING,
    eventname STRING,
    awsregion STRING,
    sourceipaddress STRING,
    useragent STRING,
    errorcode STRING,
    errormessage STRING,
    requestparameters STRING,
    responseelements STRING,
    additionaleventdata STRING,
    requestid STRING,
    eventid STRING,
    resources ARRAY<STRUCT<
        ARN:STRING,
        accountId:STRING,
        type:STRING>>,
    eventtype STRING,
    apiversion STRING,
    readonly STRING,
    recipientaccountid STRING,
    serviceeventdetails STRING,
    sharedeventid STRING,
    vpcendpointid STRING
)
PARTITIONED BY (account string, region string, year string)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://<BUCKET>/AWSLogs/';
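Since this table is partitioned by account, region and year, Athena only reads partitions that have been registered; with none registered, every query returns zero rows. A minimal sketch of registering one partition manually (the account ID, region, year and the `<BUCKET>` placeholder are examples to replace with your own values):

```sql
-- Register a single partition; repeat (or script) for each account/region/year.
ALTER TABLE cloudtrail_logs ADD IF NOT EXISTS
PARTITION (account = '111122223333', region = 'eu-west-1', year = '2017')
LOCATION 's3://<BUCKET>/AWSLogs/111122223333/CloudTrail/eu-west-1/2017/';
```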
I want to find the identity of the user who launched an EC2 instance, so I need to parse the responseelements field and keep only the rows whose responseelements contains a particular instance ID.
The responseelements field looks like this:
{
"requestId":"cab34472-31cc-44cd-ae32-a84077e55cb6",
"reservationId":"r-05964c8549788ac50",
"ownerId":"xxxxxxxxxx",
"groupSet":{},
"instancesSet":{
"items":[
{"instanceId":"i-043543cb4c12",
"imageId":"ami-078df974",
"instanceState":{"code":0,"name":"pending"},
"privateDnsName":"ip-444444.eu-west-1.compute.internal",
"keyName":"key-dev","amiLaunchIndex":0,"productCodes":{},
"instanceType":"t2.large",
"launchTime":1488438050000,
"placement":{"availabilityZone":"eu-west-1b","tenancy":"default"},
"monitoring":{"state":"pending"},
"subnetId":"subnet-d8fffff",
"vpcId":"vpc-444435",
"privateIpAddress":"10.0.42.49",
"stateReason":{"code":"pending","message":"pending"},
"architecture":"x86_64",
"rootDeviceType":"ebs",
"rootDeviceName":"/dev/xvda",
"blockDeviceMapping":{},
"virtualizationType":"hvm",
"hypervisor":"xen",
"clientToken":"c6e53004-c561-437d-a642-196489ff297c_subnet-fffffffff",
"groupSet":{"items":[{"groupId":"sg-64878700","groupName":"MetamSecurityGroup"}]},
"sourceDestCheck":true,
"networkInterfaceSet":{
"items":[
{"networkInterfaceId":"eni-b16b66f0",
"subnetId":"subnet-dffffff",
"vpcId":"vpc-50fffff35",
"ownerId":"xxxxxxxx",
"status":"in-use",
"macAddress":"fdsfdsfsdfqdsf",
"privateIpAddress":"10.0.42.34234213",
"privateDnsName":"ip-1dddddd.eu-west-1.compute.internal",
"sourceDestCheck":true,
"groupSet":{"items":[{"groupId":"sg-64878700","groupName":"MetamSecurityGroup"}]},
"attachment":{"attachmentId":"eni-attach-45619121","deviceIndex":0,"status":"attaching","attachTime":1488438050000,"deleteOnTermination":true},
"privateIpAddressesSet":{"item":[{"privateIpAddress":"10ffffff","privateDnsName":"ip-ffffff.eu-west-1.compute.internal","primary":true}]},
"ipv6AddressesSet":{},
"tagSet":{}}]}
,"iamInstanceProfile":{"arn":"arn:aws:iam::xxxxx:instance-profile/infra-EC2InstanceProfile-1D59C5YR0LIYJ","id":"eeeeeeeeeeeeeeeeee"},
"ebsOptimized":false}
]
},
"requesterId":"226008221399"
}
This is the query that I tried:
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime,
       json_extract(responseelements, '$.instanceId') as instance_id
FROM cloudtrail_logs
WHERE account = 'xxxxxxxxxxxxxxx'
AND eventname = 'RunInstances';
but it gives instance_id as an empty column.
How do I properly extract only the instance_id from responseelements?
I found the right query to find the owner of an EC2 instance; it might help someone! The instanceId is nested inside the instancesSet.items array, so the JSON path must include that prefix:
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime,
       json_extract(responseelements, '$.instancesSet.items[0].instanceId') as instance_id
FROM cloudtrail_logs
WHERE account = 'xxxxxxx'
AND eventname = 'RunInstances'
AND responseelements LIKE '%i-3434ecb4c12%';
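Note that a single RunInstances call can launch several instances, and items[0] only returns the first. If that matters, here is a hedged sketch that unnests every element of instancesSet.items instead (Presto/Athena syntax, using the table and columns defined above; untested against your data):

```sql
-- Expand the items array so each launched instance becomes its own row.
SELECT DISTINCT useridentity.userName, eventtime,
       json_extract_scalar(item, '$.instanceId') AS instance_id
FROM cloudtrail_logs
CROSS JOIN UNNEST(
    CAST(json_extract(responseelements, '$.instancesSet.items') AS ARRAY(JSON))
) AS t (item)
WHERE eventname = 'RunInstances';
```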
Great answer! I was searching forever, thank you! One small change: in my case the account partition column was not needed; it threw this error:
SYNTAX_ERROR: line 3:7: Column 'account' cannot be resolved
Here is how I ran it:
SELECT DISTINCT eventsource,
eventname,
useridentity.userName,
eventtime,
json_extract(responseelements, '$.instancesSet.items[0].instanceId') as instance_id
FROM <myCloudLogTable>
WHERE eventname = 'RunInstances'
AND responseelements LIKE '%<myinstanceId>%';
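As a variant, the LIKE pattern match on the raw JSON string can be replaced by comparing the extracted value directly; a sketch, assuming the same table and a placeholder instance ID:

```sql
SELECT DISTINCT eventsource, eventname, useridentity.userName, eventtime
FROM cloudtrail_logs
WHERE eventname = 'RunInstances'
  AND json_extract_scalar(responseelements, '$.instancesSet.items[0].instanceId') = '<myinstanceId>';
```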
Related
error in spark_sql with "value as is not a member of object"
I have this case class:
case class Hotel(id: String, name: String, cc1: String, city_preferred: String)
The CSV is like:
id       name     cc1  city_hotel
1949417  apart A  pl   Sopot
2023862  apart B  es   Granada
1967734  apart C  hr   Ici
I have to show the id, name and country, but only for the hotels in Spain. I tried:
val hotelsSpain = Hotel.as("Hoteles")
  .select("cc1", "id", "name", "city_hotel")
  .where("cc1" == "es")
display(hotelsSpain)
but the code returns this error:
value as is not a member of object Hotel
val hotelsSpain = Hotel.as("Hoteles")
I think it is a little error but I don't see it. Thank you.
In your schema the column is defined as "city_prefered" (sic), but your code references "city_hotel". It looks to me like this error is simply telling you that "city_hotel" doesn't exist on this object.
How can I insert data into this structure in BigQuery?
In BigQuery, I want to insert some data into this very simple data structure:
Field          Type    Mode
id             STRING  NULLABLE
policies       RECORD  REPEATED
  s            RECORD  NULLABLE
    something  STRING  NULLABLE
    riskTypes  RECORD  REPEATED
      code     STRING  NULLABLE
In the light of my previous question, I would expect the syntax to be as follows:
UPDATE `tablename`
SET policies = ARRAY_CONCAT(
    policies,
    [
        struct<s struct<something STRING, riskTypes ARRAY<struct<code STRING>>>
            ("example something", [("example description")])
    ]
)
WHERE id = 'Moose';
But this gives an error: Unexpected "[" (before the "example description")
Below should work:
update `tablename`
set policies = policies || [
    struct<s struct<something string, riskTypes array<struct<code string>> >>
        (struct('example something' as something,
                [struct('example description 1' as code), struct('example description 2')]))
]
where id = 'Moose'
my MyBatis query does not work for Oracle, but works when tested in SQL Developer
I'm new to MyBatis. I created a mapper to get the timestamp I want, but it works for only one period of time; later it just returns null. For another function, I use SimpleDateFormat to format the date as "dd-MMM-yyyy hh.mm.ss.SSS a" so I can insert it into a TIMESTAMP(3) column of the Oracle database. Below is my code, any suggestions?
@Select("SELECT MAX(TIME_STARTED) FROM METADATA_SYNC_LOG WHERE STATUS = 'SUCCESSFUL'")
Date getLastSyncStartedTime();

@Insert("INSERT INTO METADATA_SYNC_LOG (ID, SERVICE, HOST, TIME_STARTED, TIME_FINISHED, STATUS, SYNCRESULT) values (#{ID}, #{SERVICE}, #{HOST}, #{TIME_STARTED}, #{TIME_FINISHED}, #{STATUS}, #{SYNCRESULT})")
void insertIntoMetadata_Sync_log(@Param("ID") String id, @Param("SERVICE") String service, @Param("HOST") String host, @Param("TIME_STARTED") String time_started, @Param("TIME_FINISHED") String time_finished, @Param("STATUS") String status, @Param("SYNCRESULT") String syncresult);

@Update("MERGE INTO SERVICE_METADATA d USING (SELECT #{SERVICE} SERVICE, #{ID} ID, #{SERVICE_ENTITY} SERVICE_ENTITY, #{TIME_CREATED} TIME_CREATED, #{TIME_FINISHED} TIME_FINISHED, #{DETAILS} DETAILS FROM dual) s ON (d.ID = s.ID) WHEN MATCHED THEN UPDATE SET d.SERVICE = s.SERVICE, d.SERVICE_ENTITY = s.SERVICE_ENTITY, d.TIME_CREATED = s.TIME_CREATED, d.TIME_FINISHED = s.TIME_FINISHED, d.DETAILS = s.DETAILS WHEN NOT MATCHED THEN INSERT (SERVICE, ID, SERVICE_ENTITY, TIME_CREATED, TIME_FINISHED, DETAILS) VALUES (s.SERVICE, s.ID, s.SERVICE_ENTITY, s.TIME_CREATED, s.TIME_FINISHED, s.DETAILS)")
void insertOrUpdateIntoService_Metadata(@Param("SERVICE") String service, @Param("ID") String id, @Param("SERVICE_ENTITY") String service_entity, @Param("TIME_CREATED") String time_created, @Param("TIME_FINISHED") String time_finished, @Param("DETAILS") String details);
micronaut-data and composite key mapping
I have an entity with a composite key:
@Entity
data class Page(
    @EmbeddedId val pageId: PageId,
    ...
)

@Embeddable
data class PageId(
    @Column(name = "id") val id: UUID,
    @Column(name = "is_published") val isPublished: Boolean
)
But I need to respect the existing column names in the db table, which are 'id' and 'is_published'. Querying the db with a JDBCRepository I get the error:
SQL Error executing Query: ERROR: column page_.page_id_published does not exist
Is there any way that I can map the columns correctly?
Trial and error led me to the answer: somehow Micronaut does not like a Boolean to be named 'isPublished'. When I rename it to 'published' it works fine:
data class PageId(
    @MappedProperty(value = "id") val id: UUID,
    @MappedProperty(value = "is_published") val published: Boolean
)
Running Plain SQL dynamically in Quill using infix fails with wrong query syntax during runtime
I want to construct my query in plain SQL and then run it using Quill; I am using the infix operator. My code is like this:
case class Employee(name: String, age: String, company_name: String)
case class Company(name: String, pin_code: String)
case class CombinedEmployee(employee_age: Int, employee_name: Option[String], company_name: String, pin: Option[String])

// it can be anything which is calculated dynamically
val sql = "SELECT t1.age AS employee_age, t1.name AS employee_name, t2.name AS company_name, t2.pin as pin FROM employee t1 JOIN company t2 ON t1.company_name = t2.name"

def rawQuery = quote { (myQuery: String) =>
  infix"""$myQuery""".as[Query[CombinedEmployee]]
}
I am calling ctx.translate(rawQuery(lift(sql))), and this prints:
SELECT x.employee_age, x.employee_name, x.company_name, x.pin FROM ('SELECT t1.age AS employee_age, t1.name AS employee_name, t2.name AS company_name, t2.pin as pin FROM employee t1 JOIN company t2 ON t1.company_name = t2.name') AS x
which is wrong syntax, and when I executed it using the run method it failed. Is this the correct way to write plain SQL? If not, is there a different way to run plain SQL and map it into case classes?
This can be solved by adding a # before the $:
def rawQuery = quote { (myQuery: String) =>
  infix"""#$myQuery""".as[Query[CombinedEmployee]]
}