Can the KnowledgeAgent be used to automatically write the KnowledgeBase to a file so it can be used externally? - serialization

I'm working on a small Drools project and have the following problem:
- When I read the knowledge packages via the KnowledgeAgent, loading takes a long time (I know that building the KnowledgeBase is expensive in general, and especially so when the packages are loaded from Guvnor).
So I'm trying to serialize the KnowledgeBase to a file stored locally on the system, on the one hand because loading the KnowledgeBase from a local file is much, much faster, and on the other hand so that I can use the KnowledgeBase from other applications. The problem is that when the KnowledgeAgent loads the KnowledgeBase the first time, the agent keeps the base updated automatically,
BUT: while the base is updated, my local file is not.
So I'm wondering how to handle/receive the change notification from my KnowledgeAgent, so that I can call a method to serialize my KnowledgeBase.
Is this somehow possible? Basically I just want to update my local KnowledgeBase file every time someone edits a rule in Guvnor, so that my local file is always up to date.
If it isn't possible, or it is a really bad approach to begin with, what is the recommended/best way to go about it?
Please excuse my English and the question itself if you can't really make out what I want to accomplish, or if my approach is actually not a good one or the question is redundant; I'm rather new to Java and a total noob when it comes to Drools.
Down below is the code:
public class DroolsConnection {

    // CHANGESET_PATH and SERIALIZE_BASE_PATH are path constants defined elsewhere in this class
    private static KnowledgeAgent kAgent;
    private static KnowledgeBase kAgentBase;

    public DroolsConnection() {
        // start the services that watch the change-set resources for updates
        ResourceFactory.getResourceChangeNotifierService().start();
        ResourceFactory.getResourceChangeScannerService().start();
    }

    public KnowledgeBase readKnowledgeBase() throws Exception {
        kAgent = KnowledgeAgentFactory.newKnowledgeAgent("guvnorAgent");
        kAgent.applyChangeSet(ResourceFactory.newFileResource(CHANGESET_PATH));
        kAgent.monitorResourceChangeEvents(true);
        kAgentBase = kAgent.getKnowledgeBase();
        serializeKnowledgeBase(kAgentBase);
        return kAgentBase;
    }

    public List<EvaluationObject> runAgainstRules(List<EvaluationObject> objectsToEvaluate,
                                                  KnowledgeBase kBase) throws Exception {
        StatefulKnowledgeSession knowSession = kBase.newStatefulKnowledgeSession();
        KnowledgeRuntimeLogger knowLogger = KnowledgeRuntimeLoggerFactory.newFileLogger(knowSession, "logger");
        for (EvaluationObject o : objectsToEvaluate) {
            knowSession.insert(o);
        }
        knowSession.fireAllRules();
        knowLogger.close();
        knowSession.dispose();
        return objectsToEvaluate;
    }

    public KnowledgeBase serializeKnowledgeBase(KnowledgeBase kBase) throws IOException {
        OutputStream outStream = new FileOutputStream(SERIALIZE_BASE_PATH);
        ObjectOutputStream oos = new ObjectOutputStream(outStream);
        oos.writeObject(kBase);
        oos.close();
        return kBase;
    }

    public KnowledgeBase loadFromSerializedKnowledgeBase() throws Exception {
        InputStream is = new FileInputStream(SERIALIZE_BASE_PATH);
        ObjectInputStream ois = new ObjectInputStream(is);
        KnowledgeBase kBase = (KnowledgeBase) ois.readObject();
        ois.close();
        return kBase;
    }
}
Thanks for your help in advance!
best regards,
Marenko

In order to keep your local kbase updated you could use a KnowledgeAgentEventListener to know when its internal kbase gets updated:
kagent.addEventListener( new KnowledgeAgentEventListener() {
public void beforeChangeSetApplied(BeforeChangeSetAppliedEvent event) {
}
public synchronized void afterChangeSetApplied(AfterChangeSetAppliedEvent event) {
}
public void beforeChangeSetProcessed(BeforeChangeSetProcessedEvent event) {
}
public void afterChangeSetProcessed(AfterChangeSetProcessedEvent event) {
}
public void beforeResourceProcessed(BeforeResourceProcessedEvent event) {
}
public void afterResourceProcessed(AfterResourceProcessedEvent event) {
}
public void knowledgeBaseUpdated(KnowledgeBaseUpdatedEvent event) {
//THIS IS THE EVENT YOU ARE INTERESTED IN
}
public void resourceCompilationFailed(ResourceCompilationFailedEvent event) {
}
} );
You still need to handle concurrent access to your local kbase, though.
By the way, since you are not using the 'newInstance' configuration option, the agent will create a new kbase instance each time a change-set is applied. So make sure you serialize the kagent's internal kbase (kagent.getKnowledgeBase()) instead of the reference you are holding in your app.
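Putting the pieces together, the body of the knowledgeBaseUpdated callback above could look roughly like the sketch below. It reuses the kAgent field and the serializeKnowledgeBase method from your DroolsConnection class, and the synchronized block is just one simple way to guard the file against concurrent writes:
public void knowledgeBaseUpdated(KnowledgeBaseUpdatedEvent event) {
    synchronized (DroolsConnection.class) {
        try {
            // always serialize the agent's current internal kbase, not a previously returned reference
            serializeKnowledgeBase(kAgent.getKnowledgeBase());
        } catch (IOException e) {
            // decide how to handle a failed write, e.g. log it and retry later
        }
    }
}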
Hope it helps,

Related

How to catch exceptions thrown by BigQueryIO.Write and rescue the data that failed to be written?

I want to read data from Cloud Pub/Sub and write it to BigQuery with Cloud Dataflow. Each record contains the ID of the table the record itself should be saved to.
There are various reasons why writing to BigQuery can fail:
The table ID format is wrong.
The dataset does not exist.
The pipeline is not allowed to access the dataset.
Network failure.
When one of these failures occurs, the streaming job retries the task and stalls. I tried using WriteResult.getFailedInserts() to rescue the bad data and avoid stalling, but it did not work well. Is there a good way to handle this?
Here is my code:
public class StarterPipeline {
private static final Logger LOG = LoggerFactory.getLogger(StarterPipeline.class);
public class MyData implements Serializable {
String table_id;
}
public interface MyOptions extends PipelineOptions {
@Description("PubSub topic to read from, specified as projects/<project_id>/topics/<topic_id>")
@Validation.Required
ValueProvider<String> getInputTopic();
void setInputTopic(ValueProvider<String> value);
}
public static void main(String[] args) {
MyOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().as(MyOptions.class);
Pipeline p = Pipeline.create(options);
PCollection<MyData> input = p
.apply("ReadFromPubSub", PubsubIO.readStrings().fromTopic(options.getInputTopic()))
.apply("ParseJSON", MapElements.into(TypeDescriptor.of(MyData.class))
.via((String text) -> new Gson().fromJson(text, MyData.class)));
WriteResult writeResult = input
.apply("WriteToBigQuery", BigQueryIO.<MyData>write()
.to(new SerializableFunction<ValueInSingleWindow<MyData>, TableDestination>() {
@Override
public TableDestination apply(ValueInSingleWindow<MyData> input) {
MyData myData = input.getValue();
return new TableDestination(myData.table_id, null);
}
})
.withSchema(new TableSchema().setFields(new ArrayList<TableFieldSchema>() {{
add(new TableFieldSchema().setName("table_id").setType("STRING"));
}}))
.withFormatFunction(new SerializableFunction<MyData, TableRow>() {
@Override
public TableRow apply(MyData myData) {
return new TableRow().set("table_id", myData.table_id);
}
})
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
.withFailedInsertRetryPolicy(InsertRetryPolicy.neverRetry()));
writeResult.getFailedInserts()
.apply("LogFailedData", ParDo.of(new DoFn<TableRow, TableRow>() {
@ProcessElement
public void processElement(ProcessContext c) {
TableRow row = c.element();
LOG.info(row.get("table_id").toString());
}
}));
p.run();
}
}
There is no easy way to catch exceptions when writing to output in a pipeline definition. I suppose you could do it by writing a custom PTransform for BigQuery. However, there is no way to do it natively in Apache Beam. I also recommend against this because it undermines Cloud Dataflow's automatic retry functionality.
In your code example, the failed-insert retry policy is set to never retry. You can set the policy to always retry instead; this only helps with transient problems such as an intermittent network failure (4th bullet point).
.withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
If the table ID format is incorrect (1st bullet point), the CREATE_IF_NEEDED create disposition should allow the Dataflow job to create the table automatically without raising an error.
If the dataset does not exist, or the pipeline does not have permission to access it (2nd and 3rd bullet points), then in my opinion the streaming job should stall and ultimately fail. There is no way to proceed without manual intervention.
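If you do keep the neverRetry policy from your example and want to preserve the failed rows rather than only logging their table_id, one possible approach is to route writeResult.getFailedInserts() into a second BigQuery write. This is only a sketch: the dead-letter table name below is a made-up placeholder, and the schema simply mirrors the single table_id field from your example.
writeResult.getFailedInserts()
    .apply("WriteFailedRowsToDeadLetter", BigQueryIO.writeTableRows()
        // hypothetical dead-letter table; replace with a destination of your own
        .to("my-project:my_dataset.failed_inserts")
        .withSchema(new TableSchema().setFields(new ArrayList<TableFieldSchema>() {{
            add(new TableFieldSchema().setName("table_id").setType("STRING"));
        }}))
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));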

ASP.NET Core 2.0: Detect whether the Startup class was invoked from a migration or another EF operation

At the moment, the whole default Startup.cs flow is executed on every database-related operation, such as dropping the database, adding a migration, updating the database to a migration, etc.
I have heavy app-specific code in Startup that needs to be invoked only when the application runs for real. So how can I detect that the Startup class was run from a migration or another database-related dotnet command?
Well, as already noted in a comment on the question, there is an IDesignTimeDbContextFactory interface that needs to be implemented to resolve the DbContext at design time.
It could look somewhat like this:
public static class Program
{
    ...
    public static IWebHost BuildWebHostDuringGen(string[] args)
    {
        return WebHost.CreateDefaultBuilder(args)
            .UseStartup<StartupGen>() // <--- a different, less complex Startup subclass is used here
            .UseDefaultServiceProvider(options => options.ValidateScopes = false)
            .Build();
    }
}
public class DbContextFactory : IDesignTimeDbContextFactory<MyDbContext>
{
    public MyDbContext CreateDbContext(string[] args)
    {
        return Program.BuildWebHostDuringGen(args).Services.GetRequiredService<MyDbContext>();
    }
}
However, for reasons that remain unclear (I asked the folks at Microsoft, but they did not explain it to me), dotnet currently calls Program.BuildWebHost implicitly on every operation, even if it is private. That is why the standard flow is executed every time for the question's author. The workaround is to rename Program.BuildWebHost to something else, like InitWebHost.
There is an issue filed for that, so maybe it will be resolved in the 2.1 release or later.
The documentation is still a bit unclear as to why this occurs; I've yet to find any concrete answer as to why it runs Startup.Configure. In 2.0 it's recommended to move any migration/seeding code to Program.Main. Here's an example by bricelam on GitHub.
public static IWebHost MigrateDatabase(this IWebHost webHost)
{
using (var scope = webHost.Services.CreateScope())
{
var services = scope.ServiceProvider;
try
{
var db = services.GetRequiredService<ApplicationDbContext>();
db.Database.Migrate();
}
catch (Exception ex)
{
var logger = services.GetRequiredService<ILogger<Program>>();
logger.LogError(ex, "An error occurred while migrating the database.");
}
}
return webHost;
}
public static void Main(string[] args)
{
BuildWebHost(args)
.MigrateDatabase()
.Run();
}

Wicket deployment mode maps resources the wrong way

I have a page mounted like this:
getRootRequestMapperAsCompound().add(new NoVersionMapper("/card/${cardId}", CardPage.class));
On this page there is a TinyMCE4 editor, which tries to load images using the relative path "images/1.jpg".
I've added a resource mapping so that the images load successfully:
mountResource("/card/image/${imageId}", imageResourceReference);
In DEVELOPMENT mode everything works nicely and the images are loaded into the editor, but in DEPLOYMENT mode the page is called twice: first for /card/1 and a second time for /card/image/1.jpg.
How do I correctly mount resources for DEPLOYMENT mode?
UPDATE: it looks like I found the reason:
public int getCompatibilityScore(Request request)
{
return 0; // pages always have priority over resources
}
But then the question is: why does it work fine in development mode?
Update 2: I haven't found a better solution than adding my own resource mapper with an overridden getCompatibilityScore():
public class ImageResourceMapper extends ResourceMapper {
private String[] mountSegments;
public ImageResourceMapper(String path, ResourceReference resourceReference) {
super(path, resourceReference);
mountSegments = getMountSegments(path);
}
public ImageResourceMapper(String path, ResourceReference resourceReference, IPageParametersEncoder encoder) {
super(path, resourceReference, encoder);
mountSegments = getMountSegments(path);
}
@Override
public int getCompatibilityScore(Request request) {
if (urlStartsWith(request.getUrl(), mountSegments)) {
return 10;
}
return 0;
}
}
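For completeness, a short usage sketch: assuming the mapper is registered in your Application's init() in place of the earlier mountResource call, and that imageResourceReference is the same reference as before, mounting it could look like this:
// in YourApplication#init(), instead of mountResource("/card/image/${imageId}", imageResourceReference)
mount(new ImageResourceMapper("/card/image/${imageId}", imageResourceReference));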

Combining dataproviders TestNG

I have read a few Stack Overflow posts about combining data providers, but I can't get anything to work.
What I'm currently doing is a Selenium test that takes screenshots of every language the site is translated to.
It simply clicks through every link while taking screenshots, then switches the URL to another language and repeats.
My problem is that when doing this I can't redirect my screenshots to a specific folder per "language test". To do this I would need a second data provider, but the test method already has a data provider that supplies a different URL per test.
So I need to combine these two data providers somehow.
They currently look like this:
public static Object [][] language(){
return new Object[][]{
{"https://admin-t1.taxicaller.net/login/admin.php?lang=en"},
{"https://admin-t1.taxicaller.net/login/admin.php?lang=sv"},
};
}
public static Object [][] directory(){
return new Object[][]{
{"screenshotsEnglish.dir"},
{"screenshotsSwedish.dir"},
};
}
In my test class I just want to access these two values by writing
driver.get(**url**);
// This is the screenshot method. Where "Directory" is written I decide where to save the screenshots
Properties settings = PropertiesLoader.fromResource("settings.properties");
String screenshotDir = settings.getProperty(**directory**);
screenShooter = new ScreenShooter(driver, screenshotDir, "en");
Hope I have made myself clear, appreciate all help!
Regards
public static Object[][] dp() {
return new Object[][]{
{
"https://admin-t1.taxicaller.net/login/admin.php?lang=en",
"screenshotsEnglish.dir"
},
{
"https://admin-t1.taxicaller.net/login/admin.php?lang=sv",
"screenshotsSwedish.dir"
}
};
}
@Test(dataProvider = "dp")
public void t(String url, String directory) {
driver.get(url);
Properties settings = PropertiesLoader.fromResource("settings.properties");
String screenshotDir = settings.getProperty(directory);
screenShooter = new ScreenShooter(driver, screenshotDir, "en");
/*...*/
}

How to list JBoss AS 7 datasource properties in Java code?

I'm running JBoss AS 7.1.0.CR1b. I've got several datasources defined in my standalone.xml e.g.
<subsystem xmlns="urn:jboss:domain:datasources:1.0">
<datasources>
<datasource jndi-name="java:/MyDS" pool-name="MyDS_Pool" enabled="true" use-java-context="true" use-ccm="true">
<connection-url>some-url</connection-url>
<driver>the-driver</driver>
[etc]
Everything works fine.
I'm trying to access the information contained here within my code - specifically the connection-url and driver properties.
I've tried getting the Datasource from JNDI, as normal, but it doesn't appear to provide access to these properties:
// catches removed
InitialContext context;
DataSource dataSource = null;
context = new InitialContext();
dataSource = (DataSource) context.lookup(jndi);
The ClientInfo and DatabaseMetadata obtained from a Connection of this DataSource don't contain these granular JBoss properties either.
My code will be running inside the container with the datasource specified, so everything should be available. I've looked at the IronJacamar interface org.jboss.jca.common.api.metadata.ds.DataSource, and its implementing class, and these seem to have accessible hooks to the information I require, but I can't find any information on how to obtain such objects for resources already deployed in the container (the only constructor on the implementation requires passing in all the properties manually).
JBoss AS 7's Command-Line Interface allows you to navigate and list the datasources as a directory system. http://www.paykin.info/java/add-datasource-programaticaly-cli-jboss-7/ provides an excellent post on how to use what I believe is the Java Management API to interact with the subsystem, but this appears to involve connecting to the target JBoss server. My code is already running within that server, so surely there must be an easier way to do this?
Hope somebody can help. Many thanks.
What you're really trying to do is a management action. The best way is to use the management APIs that are available.
Here is a simple standalone example:
public class Main {
public static void main(final String[] args) throws Exception {
final List<ModelNode> dataSources = getDataSources();
for (ModelNode dataSource : dataSources) {
System.out.printf("Datasource: %s%n", dataSource.asString());
}
}
public static List<ModelNode> getDataSources() throws IOException {
final ModelNode request = new ModelNode();
request.get(ClientConstants.OP).set("read-resource");
request.get("recursive").set(true);
request.get(ClientConstants.OP_ADDR).add("subsystem", "datasources");
ModelControllerClient client = null;
try {
client = ModelControllerClient.Factory.create(InetAddress.getByName("127.0.0.1"), 9999);
final ModelNode response = client.execute(new OperationBuilder(request).build());
reportFailure(response);
return response.get(ClientConstants.RESULT).get("data-source").asList();
} finally {
safeClose(client);
}
}
public static void safeClose(final Closeable closeable) {
if (closeable != null) try {
closeable.close();
} catch (Exception e) {
// no-op
}
}
private static void reportFailure(final ModelNode node) {
if (!node.get(ClientConstants.OUTCOME).asString().equals(ClientConstants.SUCCESS)) {
final String msg;
if (node.hasDefined(ClientConstants.FAILURE_DESCRIPTION)) {
if (node.hasDefined(ClientConstants.OP)) {
msg = String.format("Operation '%s' at address '%s' failed: %s", node.get(ClientConstants.OP), node.get(ClientConstants.OP_ADDR), node.get(ClientConstants.FAILURE_DESCRIPTION));
} else {
msg = String.format("Operation failed: %s", node.get(ClientConstants.FAILURE_DESCRIPTION));
}
} else {
msg = String.format("Operation failed: %s", node);
}
throw new RuntimeException(msg);
}
}
}
The only other way I can think of is to add a module that relies on the server's internals. It could be done, but I would probably use the management API first.
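As a follow-up, reading the two attributes the question asks about (connection-url and driver-name) from each returned entry could look roughly like the sketch below. It assumes each element of the list is a property node whose value holds the datasource attributes, which is worth verifying against your server's actual output:
for (ModelNode dataSource : getDataSources()) {
    // each entry maps the datasource name to its attribute values
    final String name = dataSource.asProperty().getName();
    final ModelNode attributes = dataSource.asProperty().getValue();
    System.out.printf("%s -> url=%s, driver=%s%n",
            name,
            attributes.get("connection-url").asString(),
            attributes.get("driver-name").asString());
}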