Initialize SNSEvent.java with SNS Event String - amazon-s3

I'm working with an in-house framework to consume S3Events. The current framework can handle plain S3EventNotifications by getting the events InputStream, and passing it like below...
String inputStream = CharStreams.toString( new InputStreamReader(eventInputStream, StandardCharsets.UTF_8));
S3EventNotification s3EventNotification = S3EventNotification.parseJson(inputStream);
For technical reasons, I now need to update this framework to use SNS Events, which are quite different from a raw JSON perspective. I'm hoping to find an SNSEvent equivelant of S3EventNotification so I can do something like below...
String inputStream = CharStreams.toString( new InputStreamReader(eventInputStream, StandardCharsets.UTF_8));
SNSEvent snsEvent = SomethingGoesHere.parseJson(inputStream);
Is there anything like that out there? I've looked through a few of the aws-java-sdk's as well as the aws-lambda-java-events artifacts out there in Maven world but can't seem to find anything suitable to this. Any ideas?
EDIT With Solution
Ended up settling with the SNSEvent.java object located in the aws-lambda-java-events jar from Maven. I used Gson with a DateTime adapter to handle the DateTime Timestamp within that object as well as the UPPER_CAMEL_CASE naming policy since the SNS Event from amazon is camel cased, seems to work fine!
Gson gson = new GsonBuilder()
.setFieldNamingPolicy(FieldNamingPolicy.UPPER_CAMEL_CASE)
.registerTypeAdapter(DateTime.class, new JsonDeserializer<DateTime>() {
#Override
public DateTime deserialize(JsonElement json, Type typeOfT, JsonDeserializationContext context)
throws JsonParseException {
return new DateTime(json.getAsString());
}
}).create();
SNSEvent snsEvent = gson.fromJson(event, SNSEvent.class);

Related

Use Java8 Stream on JDBCTemplate Results from HIVE

I am using jdbcTemplate to query hive then writing the results to a .csv file. I basically just generate a list of objects then steam the list to write each record to the file.
I will like to stream the results as they coming back from hive and write it to the file instead of wait to get the whole thing then processing it. Can anyone pointing me to the right direction? Thanks!
private List<Avs> queryAvsData(String asSql) {
List<Avs> llistAvs = new ArrayList<Avs>();
List<Map<String, Object>> rows = hiveJdbcTemplate.queryForList(asSql);
Iterator<Map<String, Object>> it = rows.iterator();
while (it.hasNext()) {
Map<String, Object> row = it.next();
Avs laAvs = Avs.builder()
.make((String) row.get("make"))
.model((String) row.get("model"))
.build();
llistAvs.add(laAvs);
}
return llistAvs;
}
It doesn't look like there's a built-in solution, but you can do it. Basically, you wrap the existing functionality in an iterator, and use a spliterator to turn it into a stream. Here's a blog post on the subject:
The code implements Spring’s ResultSetExtractor interface, which is a Single Abstract Method (SAM) interface, allowing the use of a lambda expression to implement it.
The implementation wraps the SQL ResultSet in an iterator, constructs a stream using the Spliterators and StreamSupport utility classes, and applies that to a Function taking a stream of row sets and returning a generic result.
It's possible to stream values from JdbcTemplate. The following example is a service based on Spring Boot 2.4.8.
As, I run into problems (connection leak) using queryForStream then I will put a demo code here just to know that stream must be closed after usage.
import lombok.RequiredArgsConstructor;
import org.springframework.jdbc.core.SingleColumnRowMapper;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.stereotype.Service;
import java.util.Map;
import java.util.stream.Stream;
#Service
#RequiredArgsConstructor
public class DataCleaningService {
private final NamedParameterJdbcTemplate jdbcTemplate;
public void doSomeStreaming() {
String nativeQuery = "SELECT string_value FROM my_table WHERE column = :valueToFiler";
Map<String, Object> queryParameters = Map.of("valueToFiler", "my value");
SingleColumnRowMapper<String> stringRowMapper = SingleColumnRowMapper.newInstance(String.class);
try (Stream<String> stringValueStream = jdbcTemplate.queryForStream(nativeQuery, queryParameters, stringRowMapper)) {
stringValueStream.forEach(stringValue -> {
// do the needed action with the value
//..
System.out.printf("My cool value: %s", stringValue);
});
}
}
}

Kafka Streams Code implemented as a library and function called

So, I have created a Kafka Application that basically uses filter function. Now, I have created the jar file of this application.
I want to import this application as a library in some other program and call the method to start the filtering process. Then I also want to stop this filtering using some command or API call. How can I do this??
I have provide the topology builder function below for which I want to convert into a library and call from other application.
public static Topology builderFunc(String id) {
final Serializer<JsonNode> jsonNodeSerializer = new JsonSerializer();
final Deserializer<JsonNode> jsonNodeDeserializer = new JsonDeserializer();
final Serde<JsonNode> jsonNodeSerde = Serdes.serdeFrom(jsonNodeSerializer, jsonNodeDeserializer);
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, JsonNode> inputStream =
builder.stream("input-records", Consumed.with(Serdes.String(), jsonNodeSerde));
inputStream.filter((key, value) -> key.equals(id))
.to("output-records", Produced.with(Serdes.String(), jsonNodeSerde));
final Topology topology = builder.build();
return topology;
}

Google diff-match-patch : How to unpatch to get Original String?

I am using Google diff-match-patch JAVA plugin to create patch between two JSON strings and storing the patch to database.
diff_match_patch dmp = new diff_match_patch();
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB
Now is there any way to use this patch to re-create the originalString by passing the latestString?
I google about this and found this very old comment # Google diff-match-patch Wiki saying,
Unpatching can be done by just looping through the diff, swapping
DIFF_INSERT with DIFF_DELETE, then applying the patch.
But i did not find any useful code that demonstrates this. How could i achieve this with my existing code ? Any pointers or code reference would be appreciated.
Edit:
The problem i am facing is, in the front-end i am showing a revisions module that shows all the transactions of a particular fragment (take for example an employee details), like which user has updated what details etc. Now i am recreating the fragment JSON by reverse applying each patch to get the current transaction data and show it as a table (using http://marianoguerra.github.io/json.human.js/). But some JSON data are not valid JSON and I am getting JSON.parse error.
I was looking to do something similar (in C#) and what is working for me with a relatively simple object is the patch_apply method. This use case seems somewhat missing from the documentation, so I'm answering here. Code is C# but the API is cross language:
static void Main(string[] args)
{
var dmp = new diff_match_patch();
string v1 = "My Json Object;
string v2 = "My Mutated Json Object"
var v2ToV1Patch = dmp.patch_make(v2, v1);
var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch); // Persist text to db
string v3 = "Latest version of JSON object;
var v3ToV2Patch = dmp.patch_make(v3, v2);
var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch); // Persist text to db
// Time to re-hydrate the objects
var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // .get(0) in Java I think
var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString();
}
I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved. As the audited objects have become more complex the storage requirements have increased dramatically. I haven't yet applied this to the complex large objects, but it is possible to check if the patch was successful by checking the second object in the array returned by the patch_apply method. This is an array of boolean values, all of which should be true if the patch worked correctly. You could write some code to check this, which would help check if the object can be successfully re-hydrated from the JSON rather than just getting a parsing error. My prototype C# method looks like this:
private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
patchedString = patchResult[0] as string;
var successArray = patchResult[1] as bool[];
foreach (var b in successArray)
{
if (!b)
return false;
}
return true;
}

How can I make RestSharp use BSON?

I'm using RestSharp, and using Json.NET for serialization (see here).
Json.NET supports BSON, and since some of my requests have huge blocks of binary data, I think this would speed up my application dramatically. However, as far as I can tell, RestSharp does not seem to have any in-built support for BSON.
The use of Json.NET is implemented as a custom serializer for RestSharp, and so at first glance it looks like it would be possible to modify that custom serializer to use BSON. But, the Serialize method of the RestSharp.Serializers.ISerializer interface returns a string - which is (I assume) unsuitable for BSON. So, I assume that it would take some more significant changes to RestSharp to implement this change.
Has anyone figured out a way to do this?
Update: I looked at the RestSharp source, and discovered that the RestRequest.AddBody method that takes my object and serializes it into the request body eventually calls Request.AddParameter (with the serialized object data, and the parameter type RequestBody).
I figured that I might be able to serialize my object to BSON and then call Request.AddParameter directly - and indeed I can. However, when RestSharp then executes the RestRequest, it fails to put the binary content into the request, because there are other embedded assumptions about the request content being UTF-8 encoded.
Thus it looks like this hack would not work - there would need to be some changes made to RestSharp itself, and I'm not the man for the job...
Update 2: I decided to have a go at using the debugger to figure out how much of RestSharp I'd have to change to overcome the body-encoding issue, so I swapped out my NuGet version of RestSharp and included the RestSharp project in my solution. And... it worked.
It turns out that there has been a change to RestSharp in the last few months that isn't yet in the NuGet version.
So, you can now use AddParameter and pass in an already-BSON-encoded object, and RestSharp will send it off to the server without complaint.
Per the updates in my question, it turns out that if you have the latest RestSharp source, then instead of this:
request.AddBody(myObject);
... you can do this instead whenever you have a payload that would benefit from using BSON:
using (var memoryStream = new System.IO.MemoryStream())
{
using (var bsonWriter = new Newtonsoft.Json.Bson.BsonWriter(memoryStream))
{
var serializer = new Newtonsoft.Json.JsonSerializer();
serializer.Serialize(bsonWriter, myObject);
var bytes = memoryStream.ToArray();
request.AddParameter("application/bson", bytes, RestSharp.ParameterType.RequestBody);
}
}
Note that the first parameter to AddParameter is supposedly the parameter name, but in the case of ParameterType.RequestBody it's actually used as the content type. (Yuk).
Note that this relies on a change made to RestSharp on April 11 2013 by ewanmellor/ayoung, and this change is not in the current version on NuGet (104.1). Hence this will only work if you include the current RestSharp source in your project.
Gary's answer to his own question was incredibly useful for serializing restful calls. I wanted to answer how to deserialize restful calls using JSON.NET. I am using RestSharp version 104.4 for Silverlight. My server is using Web API 2.1 with BSON support turned on.
To accept a BSON response, create a BSON Deserializer for RestSharp like so
public class BsonDeserializer : IDeserializer
{
public string RootElement { get; set; }
public string Namespace { get; set; }
public string DateFormat { get; set; }
public T Deserialize<T>(IRestResponse response)
{
using (var memoryStream = new MemoryStream(response.RawBytes))
{
using (var bsonReader = new BsonReader(memoryStream))
{
var serializer = new JsonSerializer();
return serializer.Deserialize<T>(bsonReader);
}
}
}
}
Then, ensure your request accepts "application/bson"
var request = new RestRequest(apiUrl, verb);
request.AddHeader("Accept", "application/bson");
And add a handler for that media type
var client = new RestClient(url);
client.AddHandler("application/bson", new BsonDeserializer());

Very slow performance deserializing using datacontractserializer in a Silverlight Application

Here is the situation:
Silverlight 3 Application hits an asp.net hosted WCF service to get a list of items to display in a grid. Once the list is brought down to the client it is cached in IsolatedStorage. This is done by using the DataContractSerializer to serialize all of these objects to a stream which is then zipped and then encrypted. When the application is relaunched, it first loads from the cache (reversing the process above) and the deserializes the objects using the DataContractSerializer.ReadObject() method. All of this was working wonderfully under all scenarios until recently with the entire "load from cache" path (decrypt/unzip/deserialize) taking hundreds of milliseconds at most.
On some development machines but not all (all machines Windows 7) the deserialize process - that is the call to ReadObject(stream) takes several minutes an seems to lock up the entire machine BUT ONLY WHEN RUNNING IN THE DEBUGGER in VS2008. Running the Debug configuration code outside the debugger has no problem.
One thing that seems to look suspicious is that when you turn on stop on Exceptions, you can see that the ReadObject() throws many, many System.FormatException's indicating that a number was not in the correct format. When I turn off "Just My Code" thousands of these get dumped to the screen. None go unhandled. These occur both on the read back from the cache AND on a deserialization at the conclusion of a web service call to get the data from the WCF Service. HOWEVER, these same exceptions occur on my laptop development machine that does not experience the slowness at all. And FWIW, my laptop is really old and my desktop is a 4 core, 6GB RAM beast.
Again, no problems unless running under the debugger in VS2008. Anyone else seem this? Any thoughts?
Here is the bug report link: https://connect.microsoft.com/VisualStudio/feedback/details/539609/very-slow-performance-deserializing-using-datacontractserializer-in-a-silverlight-application-only-in-debugger
EDIT: I now know where the FormatExceptions are coming from. It seems that they are "by design" - they occur when when I have doubles being serialized that are double.NaN so that that xml looks like NaN...It seems that the DCS tries to parse the value as a number, that fails with an exception and then it looks for "NaN" et. al. and handles them. My problem is not that this does not work...it does...it is just that it completely cripples the debugger. Does anyone know how to configure the debugger/vs2008sp1 to handle this more efficiently.
cartden,
You may want to consider switching over to XMLSerializer instead. Here is what I have determined over time:
The XMLSerializer and DataContractSerializer classes provides a simple means of serializing and deserializing object graphs to and from XML.
The key differences are:
1.
XMLSerializer has much smaller payload than DCS if you use [XmlAttribute] instead of [XmlElement]
DCS always store values as elements
2.
DCS is "opt-in" rather than "opt-out"
With DCS you explicitly mark what you want to serialize with [DataMember]
With DCS you can serialize any field or property, even if they are marked protected or private
With DCS you can use [IgnoreDataMember] to have the serializer ignore certain properties
With XMLSerializer public properties are serialized, and need setters to be deserialized
With XmlSerializer you can use [XmlIgnore] to have the serializer ignore public properties
3.
BE AWARE! DCS.ReadObject DOES NOT call constructors during deserialization
If you need to perform initialization, DCS supports the following callback hooks:
[OnDeserializing], [OnDeserialized], [OnSerializing], [OnSerialized]
(also useful for handling versioning issues)
If you want the ability to switch between the two serializers, you can use both sets of attributes simultaneously, as in:
[DataContract]
[XmlRoot]
public class ProfilePerson : NotifyPropertyChanges
{
[XmlAttribute]
[DataMember]
public string FirstName { get { return m_FirstName; } set { SetProperty(ref m_FirstName, value); } }
private string m_FirstName;
[XmlElement]
[DataMember]
public PersonLocation Location { get { return m_Location; } set { SetProperty(ref m_Location, value); } }
private PersonLocation m_Location = new PersonLocation(); // Should change over time
[XmlIgnore]
[IgnoreDataMember]
public Profile ParentProfile { get { return m_ParentProfile; } set { SetProperty(ref m_ParentProfile, value); } }
private Profile m_ParentProfile = null;
public ProfilePerson()
{
}
}
Also, check out my Serializer class that can switch between the two:
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
namespace ClassLibrary
{
// Instantiate this class to serialize objects using either XmlSerializer or DataContractSerializer
internal class Serializer
{
private readonly bool m_bDCS;
internal Serializer(bool bDCS)
{
m_bDCS = bDCS;
}
internal TT Deserialize<TT>(string input)
{
MemoryStream stream = new MemoryStream(input.ToByteArray());
if (m_bDCS)
{
DataContractSerializer dc = new DataContractSerializer(typeof(TT));
return (TT)dc.ReadObject(stream);
}
else
{
XmlSerializer xs = new XmlSerializer(typeof(TT));
return (TT)xs.Deserialize(stream);
}
}
internal string Serialize<TT>(object obj)
{
MemoryStream stream = new MemoryStream();
if (m_bDCS)
{
DataContractSerializer dc = new DataContractSerializer(typeof(TT));
dc.WriteObject(stream, obj);
}
else
{
XmlSerializer xs = new XmlSerializer(typeof(TT));
xs.Serialize(stream, obj);
}
// be aware that the Unicode Byte-Order Mark will be at the front of the string
return stream.ToArray().ToUtfString();
}
internal string SerializeToString<TT>(object obj)
{
StringBuilder builder = new StringBuilder();
XmlWriter xmlWriter = XmlWriter.Create(builder);
if (m_bDCS)
{
DataContractSerializer dc = new DataContractSerializer(typeof(TT));
dc.WriteObject(xmlWriter, obj);
}
else
{
XmlSerializer xs = new XmlSerializer(typeof(TT));
xs.Serialize(xmlWriter, obj);
}
string xml = builder.ToString();
xml = RegexHelper.ReplacePattern(xml, RegexHelper.WildcardToPattern("<?xml*>", WildcardSearch.Anywhere), string.Empty);
xml = RegexHelper.ReplacePattern(xml, RegexHelper.WildcardToPattern(" xmlns:*\"*\"", WildcardSearch.Anywhere), string.Empty);
xml = xml.Replace(Environment.NewLine + " ", string.Empty);
xml = xml.Replace(Environment.NewLine, string.Empty);
return xml;
}
}
}
This is a guess, but I think it is running slow in debug mode because for every exception, it is performing some actions to show the exception in the debug window, etc. If you are running in release mode, these extra steps are not taken.
I've never done this, so I really don't know id it would work, but have you tried just setting that one assembly to run in release mode while all others are set to debug? If I'm right, it may solve your problem. If I'm wrong, then you only waste 1 or 2 minutes.
About your debugging problem, have you tried to disable the exception assistant ? (Tools > Options > Debugging > Enable the exception assistant).
Another point should be the exception handling in Debug > Exceptions : you can disable the user-unhandled stuff for the CLR or only uncheck the System.FormatException exception.
Ok - I figured out the root issue. It was what I alluded to in the EDIT to the main question. The problem was that in the xml, it was correctly serializing doubles that had a value of double.NaN. I was using these values to indicate "na" for when the denominator was 0D. Example: ROE (Return on Equity = Net Income / Average Equity) when Average Equity is 0D would be serialized as:
<ROE>NaN</ROE>
When the DCS tried to de-serialize it, evidently it first tries to read the number and then catches the exception when that fails and then handles the NaN. The problem is that this seems to generate a lot of overhead when in DEBUG mode.
Solution: I changed the property to double? and set it to null instead of NaN. Everything now happens instantly in DEBUG mode now. Thanks to all for your help.
Try disabling some IE addons. In my case, the LastPass toolbar killed my Silverlight debugging. My computer would freeze for minutes each time I interacted with Visual Studio after a breakpoint.