How to recursively parse xsd files to generate a list of included schemas for incremental build in Maven?

I have a Maven project that uses the jaxb2-maven-plugin to compile some xsd files. It uses the staleFile to determine whether any of the referenced schemaFiles have changed. Unfortunately, the xsd files in question use <xs:include schemaLocation="../relative/path.xsd"/> tags to include other schema files that are not listed in the schemaFile argument, so the plugin's staleFile calculation doesn't accurately detect when things actually need to be recompiled. This winds up breaking incremental builds as the included schemas evolve.
Obviously, one solution would be to list all the recursively referenced files in the execution's schemaFile. However, there are going to be cases where developers don't do this and break the build. I'd like instead to automate the generation of this list in some way.
One approach that comes to mind would be to somehow parse the top-level XSD files and then either set a property or output a file that I can then pass into the schemaFile or schemaFiles parameter. The Groovy gmaven plugin seems like it might be a natural way to embed that functionality right into the POM. But I'm not familiar enough with Groovy to get started.
Can anyone provide some sample code? Or offer an alternative implementation/solution?
Thanks!

Not sure how you'd integrate it into your Maven build -- Maven isn't really my thing :-(
However, if you have the path to an xsd file, you should be able to get the files it references by doing something like:
def rootXsd = new File( 'path/to/xsd' )
def refs = new XmlSlurper().parse( rootXsd ).depthFirst().findAll { it.name()=='include' }*.@schemaLocation*.text()
println "$rootXsd references $refs"
So refs is a list of Strings which should be the paths to the included xsds
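Note that those strings are relative to the including file. If you need absolute paths, a small follow-up sketch resolving them against the root XSD's directory might look like:
def resolved = refs.collect { new File(rootXsd.parentFile, it).canonicalFile }
println "Absolute paths: $resolved"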

Based on tim_yates's answer, the following is a workable solution, which you may have to customize based on how you are configuring the jaxb2 plugin.
Configure a gmaven-plugin execution early in the lifecycle (e.g., in the initialize phase) that runs with the following configuration...
Start with a function to collect File objects of referenced schemas (this is a refinement of Tim's answer):
def findRefs = { f ->
    def relPaths = new XmlSlurper().parse(f).depthFirst().findAll {
        it.name()=='include'
    }*.@schemaLocation*.text()
    relPaths.collect { new File(f.absoluteFile.parent + "/" + it).canonicalFile }
}
Wrap that in a function that iterates on the results until all children are found:
def recursiveFindRefs = { schemaFiles ->
    def outputs = [] as Set
    def inputs = schemaFiles as Queue
    // Breadth-first examine all refs in all schema files
    while (xsd = inputs.poll()) {
        outputs << xsd
        findRefs(xsd).each {
            if (!outputs.contains(it)) inputs.add(it)
        }
    }
    outputs
}
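For example (hypothetical path, assuming a single top-level schema), you could call it directly like this:
def allSchemas = recursiveFindRefs([new File('src/main/xsd/root.xsd')])
allSchemas.each { println it }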
The real magic then comes in when you parse the Maven project to determine what to do.
First, find the JAXB plugin:
jaxb = project.build.plugins.find { it.artifactId == 'jaxb2-maven-plugin' }
Then, parse each execution of that plugin (if you have multiple). The code assumes that each execution sets schemaDirectory, schemaFiles and staleFile (i.e., does not use the defaults!) and that you are not using schemaListFileName:
jaxb.executions.each { ex ->
    log.info("Processing jaxb execution $ex")
    // Extract the schema locations; the configuration is an Xpp3Dom
    ex.configuration.children.each { conf ->
        switch (conf.name) {
            case "schemaDirectory":
                schemaDirectory = conf.value
                break
            case "schemaFiles":
                schemaFiles = conf.value.split(/,\s*/)
                break
            case "staleFile":
                staleFile = conf.value
                break
        }
    }
Finally, we can open the schemaFiles and parse them using the functions we've defined earlier:
    def schemaHandles = schemaFiles.collect { new File("${project.basedir}/${schemaDirectory}", it) }
    def allSchemaHandles = recursiveFindRefs(schemaHandles)
...and compare their last modified times against the stale file's modification time, unlinking the stale file if necessary.
    def maxLastModified = allSchemaHandles.collect {
        it.lastModified()
    }.max()
    def staleHandle = new File(staleFile)
    if (staleHandle.lastModified() < maxLastModified) {
        log.info(" New schemas detected; unlinking $staleFile.")
        staleHandle.delete()
    }
}
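One caveat worth noting: findRefs above only follows xs:include. If your schemas also pull in other files via xs:import or xs:redefine with a schemaLocation, those files won't invalidate the stale file either. An untested variant of the sketch (assuming all schemaLocations are relative file paths) that follows those elements too:
def findRefs = { f ->
    def relPaths = new XmlSlurper().parse(f).depthFirst().findAll {
        it.name() in ['include', 'import', 'redefine']
    }*.@schemaLocation*.text().findAll { it }   // xs:import may omit schemaLocation
    relPaths.collect { new File(f.absoluteFile.parent + "/" + it).canonicalFile }
}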

Related

How to include proto files from one project in another project

I am not able to include protos from Project A in Project B. The idea is to have the protos with GrpcServices="Server" in project A and, in project B (the test project), include the same protos but as GrpcServices="Client".
ProjectA/Protos/Profile.proto
syntax = "proto3";
package profile;
option csharp_namespace = "ProjectA.Protos";
import "google/protobuf/empty.proto";
service ProfileService {
rpc Get(google.protobuf.Empty) returns (Profile);
}
message Profile {
string profile_id = 1;
string description = 2;
}
ProjectA/Protos/User.proto
syntax = "proto3";
package user;
option csharp_namespace = "ProjectA.Protos";
import "google/protobuf/wrappers.proto";
import "Protos/Profile.proto";
service UserService {
rpc Get(google.protobuf.StringValue) returns (UserDetail);
}
message UserDetail {
string id = 1;
string name = 2;
repeated profile.Profile profiles = 7;
}
Project B .csproj (The test project)
<ItemGroup>
<Protobuf Include="..\ProjectA\Protos\*.proto" GrpcServices="Client" ProtoRoot="Protos">
<Link>Protos\*.proto</Link>
</Protobuf>
</ItemGroup>
With these settings I always end up with this error:
error : File does not reside within any path specified using --proto_path (or -I). You must specify a --proto_path which encompasses this file. Note that the proto_path must be an exact prefix of the .proto file names -- protoc is too dumb to figure out when two paths (e.g. absolute and relative) are equivalent (it's harder than you think).
Reporting back on the solution I applied.
The problem with doing this import within the same solution is that each .csproj that includes the protos generates its own C# classes, and these classes are generated in a global context. That prevents generating separate "server" and "client" classes, because they would have the same names.
The easiest solution was to set GrpcServices="Both" in the proto include configuration. The generated C# classes then cover both the client and the server side, so they can be used both in project A (server) and in project B (client).
<Protobuf Include="Protos\*.proto" GrpcServices="Both" />
The solution suggested by Jan Tattermusch in the comments, a separate project C containing only the proto files with the same configuration, is also a great alternative.

Use of Gradle Kotlin DSL Jar.from()

I'm trying to include a single source file for the Main-Class of a jar -- actually I have a toplevel directory of such files, demo/, but I don't want them all in a jar. I want separate jars, each using only one of these.
This seems like sort of an anti-pattern in gradle, as the fundamental mechanism infers or prefers that I should instead place each in a distinct sourceSet. Ugh.
A casual reading of the docs implies Jar.from() might be useful this way: "Specifies the source files or directories..."
As it turns out, "source" is perhaps a bit of a misnomer. Here's an example, a typical kotlin fat jar with the added from("demo/LockingBufferDemo.kt"):
val jar by tasks.getting(Jar::class) {
    manifest { attributes["Main-Class"] = "LockingBufferDemoKt" }
    from(sourceSets.main.get().output)
    from("demo/LockingBufferDemo.kt")
    dependsOn(configurations.runtimeClasspath)
    from({
        configurations.runtimeClasspath.get()
            .filter { it.name.endsWith("jar") }
            .map { zipTree(it) }
    })
}
Forgive my naivety: Guess what does not end up in the jar? LockingBufferDemo.class. Guess what does? LockingBufferDemo.kt. In other words, this is treated more like a resource, not a source, and what would have been the simplest answer is a dead end.
Another way to approach this would be to add the demo directory as an independent sourceSet and then use from(sourceSets["demo"].get(), except I can't find a way to complete that; according to IntelliJ, get() returns a rather opaque "Provider" which I can't find mentioned in the actual javadoc, and I really feel like I'm heading down the garden path at this point with the woods rapidly growing darker around me.
This should not be this complicated.
How can I add a single file (or class derived from such) into a jar in gradle without having to put it alone in a directory and create a sourceSet for every such directory?
Regarding your explanations at the start of your post, you should consider creating multiple tasks of type Jar on your own, as every task of type Jar will only create a single JAR file, and you "want separate jars". I do not think you should use different source sets, as all of the files are Kotlin source files in the end and are processed in the same way (compilation, tests, docs ...). Multiple source sets would complicate this common pipeline.
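For illustration, here is a rough sketch of that advice (written in Gradle's Groovy DSL for brevity; the Kotlin DSL equivalent is analogous), assuming the files under demo/ are compiled as part of the main source set so their classes land in sourceSets.main.output:
// Hypothetical sketch: register one Jar task per demo source file.
fileTree('demo').matching { include '*.kt' }.files.each { demoFile ->
    def baseName = demoFile.name - '.kt'                  // e.g. LockingBufferDemo
    tasks.register("${baseName}Jar", Jar) {
        archiveBaseName = baseName
        manifest { attributes 'Main-Class': "${baseName}Kt" }
        from(sourceSets.main.output) {
            include "**/${baseName}*.class"                // only this demo's classes
        }
    }
}
Whether you scan demo/ or list the files explicitly is a matter of taste; the point is simply one Jar task per desired jar.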
"Specifies the source files or directories..." As it turns out, "source" is perhaps a bit of a misnomer.
Well, the documentation does not stop there, but it says "for a copy and creates a child CopySpec". So it is not the source as in source code, but the source of a copy operation. In Gradle, tasks that create an archive (ZIP, JAR) share their API with tasks that copy files, as the creation of an archive can be seen as copying files from their source location to their target location (inside the archive).
So, the from method can be used to specify the files that are copied / archived. But it takes not only a sourcePath parameter, but also a closure or action for configuration. Using this second parameter, you can narrow your source files or directories down to the one file you need, for example using the method include:
val jar by tasks.getting(Jar::class) {
    manifest { attributes["Main-Class"] = "LockingBufferDemoKt" }
    from(sourceSets.main.get().output) {
        include("**/LockingBufferDemo.class")
    }
    dependsOn(configurations.runtimeClasspath)
    from({
        configurations.runtimeClasspath.get()
            .filter { it.name.endsWith("jar") }
            .map { zipTree(it) }
    })
}

How to disable default gradle buildType suffix (-release, -debug)

I migrated a 3rd-party tool's build.gradle configs, so it now uses the Android Gradle Plugin 3.5.3 and Gradle 5.4.1.
The build goes smoothly, but when I try to make an .aab archive, things break because the toolchain expects the output .aab file to be named MyApplicationId.aab, while the new Gradle defaults to outputting MyApplicationId-release.aab, with a buildType suffix that wasn't there before.
I tried to search for a solution, but documentation about product flavors is mostly about adding suffixes. How do I prevent the default "-release" suffix from being added? There weren't any product flavor blocks in the toolchain's Gradle config files.
I realized that I have to create custom tasks after reading other questions and answers:
How to change the generated filename for App Bundles with Gradle?
Renaming applicationVariants.outputs' outputFileName does not work because those are for .apks.
I'm using Gradle 5.4.1 so my Copy task syntax reference is here.
I don't quite understand where the "app.aab" name string came from, so I defined my own aabFile name string to match my toolchain's output.
I don't care about the source file so it's not deleted by another delete task.
Also my toolchain seems to be removing unknown variables surrounded by "${}" so I had to work around ${buildDir} and ${flavor} by omitting the brackets and using concatenation for proper delimiting.
tasks.whenTaskAdded { task ->
    if (task.name.startsWith("bundle")) { // e.g. bundleRelease
        def renameTaskName = "rename${task.name.capitalize()}Aab" // renameBundleReleaseAab
        def flavorSuffix = task.name.substring("bundle".length()).uncapitalize() // "release"
        tasks.create(renameTaskName, Copy) {
            def path = "$buildDir/outputs/bundle/" + "$flavorSuffix/"
            def aabFile = "${android.defaultConfig.applicationId}-" + "$flavorSuffix" + ".aab"
            from(path) {
                include aabFile
                rename aabFile, "${android.defaultConfig.applicationId}.aab"
            }
            into path
        }
        task.finalizedBy(renameTaskName)
    }
}
As the original answer said: This will add more tasks than necessary, but those tasks will be skipped since they don't match any folder.
e.g.
Task :app:renameBundleReleaseResourcesAab NO-SOURCE

Programmatically having ivy fetch sources

We have a custom build tool which depends on ivy to resolve dependencies. The configuration of the dependencies is not an ivy.xml file, but a custom configuration that allows for... well, it's irrelevant here. The key is that we're using ivy programmatically.
Given a dependency (group id, artifact id, version), we create a ModuleRevisionId:
ModuleRevisionId id = ModuleRevisionId.newInstance(orgName, moduleName, revisionName);
followed by a ModuleDescriptor. This is, I'm guessing, where I'm not convincing enough to inform ivy that I want both the target library jar file as well as the sources. I'm just not sure what a DependencyConfiguration is vs. just a 'configuration' when creating a ModuleDescriptor.
DefaultModuleDescriptor md
= new DefaultModuleDescriptor(
ModuleRevisionId.parse("org#standalone;working"),
"integration",
new java.util.Date());
DefaultDependencyDescriptor mainDep
= new DefaultDependencyDescriptor(id, /* force = */ true);
mainDep.addDependencyConfiguration("compile", "compile");
mainDep.addDependencyConfiguration("compile", "sources");
md.addDependency(mainDep);
md.addConfiguration(new Configuration("compile"));
md.addConfiguration(new Configuration("sources"));
Nor do I really understand the above vs. RetrieveOptions vs. ResolveOptions.
I need a drink.
Ok, so it took a while, but I finally wrapped my head around some of this.
// define 'our' module
DefaultModuleDescriptor md
= new DefaultModuleDescriptor(ModuleRevisionId.parse("org#standalone;working"),
/* status = */ "integration",
new java.util.Date());
// add a configuration to our module definition
md.addConfiguration(new Configuration("compile"));
// define a dependency our module has on the (third party, typically) dependee module
DefaultDependencyDescriptor mainDep = new DefaultDependencyDescriptor(md, dependeeModuleId, /* force = */ true, false, true);
mainDep.addDependencyConfiguration("compile", "default");
mainDep.addDependencyConfiguration("compile", "sources");
// define which configurations we want to resolve (only have 1 in this case anyway)
ResolveOptions resolveOptions = new ResolveOptions();
String[] confs = new String[] {"compile"};
resolveOptions.setConfs(confs);
resolveOptions.setTransitive(true); // default anyway
resolveOptions.setDownload(true); // default anyway
ResolveReport report = ivy.resolve(md, resolveOptions);
This pulls down both the default jar as well as the sources target. Note that ivy has an issue where it won't transitively pull sources, though it will transitively pull 'main' jars. So you only get the sources for the immediate dependency defined here, not for sub-dependencies.
One other weakness I'm trying to figure out is this assumes the target dependency has a 'sources' configuration. I'd rather tell it to get any artifacts of type sources/source/src. Haven't figured that one out yet.
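One direction worth exploring for that last point (an untested, Groovy-flavored sketch, so verify the artifact type and any classifier against what your repository actually publishes): declare the wanted artifact explicitly on the dependency descriptor instead of relying on a sources configuration.
import org.apache.ivy.core.module.descriptor.DefaultDependencyArtifactDescriptor

// Sketch only: explicitly request a "source"-typed artifact for the immediate dependency.
def sourcesArtifact = new DefaultDependencyArtifactDescriptor(
        mainDep,                  // the DefaultDependencyDescriptor built above
        dependeeModuleId.name,    // artifact name, usually the module name
        "source",                 // type -- some repositories publish "sources" instead
        "jar",                    // ext
        null,                     // no explicit URL
        [:])                      // extra attributes (e.g. a Maven classifier) if needed
mainDep.addDependencyArtifact("compile", sourcesArtifact)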

How do I dynamically trigger downstream builds in jenkins?

We want to dynamically trigger integration tests in different downstream builds in jenkins. We have a parametrized integration test project that takes a test name as a parameter. We dynamically determine our test names from the git repo.
We have a parent project that uses jenkins-cli to start a build of the integration project for each test found in the source code. The parent project and integration project are related via matching fingerprints.
The problem with this approach is that aggregating test results doesn't work. I think the problem is that the "downstream" integration tests are started via jenkins-cli, so jenkins doesn't realize they are downstream.
I've looked at many jenkins plugins to try to get this working. The Join and Parameterized Trigger plugins don't help because they expect a static list of projects to build. The parameter factories available for Parameterized Trigger won't work either because there's no factory to create an arbitrary list of parameters. The Log Trigger plugin won't work.
The Groovy Postbuild Plugin looks like it should work, but I couldn't figure out how to trigger a build from it.
def job = hudson.model.Hudson.instance.getJob("job")
def params = new StringParameterValue('PARAMTEST', "somestring")
def paramsAction = new ParametersAction(params)
def cause = new hudson.model.Cause.UpstreamCause(currentBuild)
def causeAction = new hudson.model.CauseAction(cause)
hudson.model.Hudson.instance.queue.schedule(job, 0, causeAction, paramsAction)
This is what finally worked for me.
NOTE: The Pipeline Plugin should render this question moot, but I haven't had a chance to update our infrastructure.
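For reference, with Pipeline the dynamic fan-out can be expressed roughly like this (a sketch with hypothetical job and directory names, not our actual setup):
// Scripted Pipeline sketch: one downstream build per dynamically discovered test name.
node {
    checkout scm   // assumes this Pipeline is configured from SCM
    // Hypothetical discovery step; adapt to wherever test names live in your repo.
    def testNames = sh(script: 'ls integration-tests', returnStdout: true).trim().split('\n')
    def branches = [:]
    testNames.each { name ->
        branches[name] = {
            build job: 'integration-test', parameters: [string(name: 'TEST_NAME', value: name)]
        }
    }
    parallel branches
}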
To start a downstream job without parameters:
job = manager.hudson.getItem(name)
cause = new hudson.model.Cause.UpstreamCause(manager.build)
causeAction = new hudson.model.CauseAction(cause)
manager.hudson.queue.schedule(job, 0, causeAction)
To start a downstream job with parameters, you have to add a ParametersAction. Suppose Job1 has parameters A and C which default to "B" and "D" respectively. I.e.:
A == "B"
C == "D"
Suppose Job2 has the same A and C parameters, but also takes parameter E which defaults to "F". The following post-build script in Job1 will copy its A and C parameters and set parameter E to the concatenation of A's and C's values:
params = []
val = ''
manager.build.properties.actions.each {
    if (it instanceof hudson.model.ParametersAction) {
        it.parameters.each {
            value = it.createVariableResolver(manager.build).resolve(it.name)
            params += it
            val += value
        }
    }
}
params += new hudson.model.StringParameterValue('E', val)
paramsAction = new hudson.model.ParametersAction(params)
jobName = 'Job2'
job = manager.hudson.getItem(jobName)
cause = new hudson.model.Cause.UpstreamCause(manager.build)
causeAction = new hudson.model.CauseAction(cause)
def waitingItem = manager.hudson.queue.schedule(job, 0, causeAction, paramsAction)
def childFuture = waitingItem.getFuture()
def childBuild = childFuture.get()
hudson.plugins.parameterizedtrigger.BuildInfoExporterAction.addBuildInfoExporterAction(
    manager.build, jobName, childBuild.number, childBuild.result
)
You have to add $JENKINS_HOME/plugins/parameterized-trigger/WEB-INF/classes to the Groovy Postbuild plugin's Additional groovy classpath.
Execute this Groovy script
import hudson.model.*
import jenkins.model.*
def build = Thread.currentThread().executable
def jobPattern = "PUTHEREYOURJOBNAME"
def matchedJobs = Jenkins.instance.items.findAll { job ->
    job.name =~ /$jobPattern/
}
matchedJobs.each { job ->
    println "Scheduling job name is: ${job.name}"
    job.scheduleBuild(1, new Cause.UpstreamCause(build), new ParametersAction([new StringParameterValue("PROPERTY1", "PROPERTY1VALUE"), new StringParameterValue("PROPERTY2", "PROPERTY2VALUE")]))
}
If you don't need to pass in properties from one build to the other just take the ParametersAction out.
The build you scheduled will have the same "cause" as your initial build. That's a nice way to pass in the "Changes". If you don't need this just do not use new Cause.UpstreamCause(build) in the function call
Since you are already starting the downstream jobs dynamically, how about waiting until they are done and copying the test result files (I would archive them on the downstream jobs and then just download the 'build' artifacts) to the parent workspace? You might need to aggregate the files manually, depending on whether the Test plugin can work with several test result pages. In the post-build step of the parent job, configure the appropriate test plugin.
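To make that concrete, a rough Groovy Postbuild sketch (hypothetical downstream job name, and assuming everything runs on the master node so plain file copies work) that pulls the downstream job's archived results into the parent workspace:
// Sketch: copy archived artifacts of the last successful downstream build
// into the parent build's workspace for a post-build test publisher to pick up.
def downstream = manager.hudson.getItem('integration-test')
def lastBuild = downstream.lastSuccessfulBuild
lastBuild.artifacts.each { artifact ->
    def src = new File(lastBuild.artifactsDir, artifact.relativePath)
    def dst = new File(manager.build.workspace.remote, "downstream-results/${artifact.relativePath}")
    dst.parentFile.mkdirs()
    dst.bytes = src.bytes
}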
Using the Groovy Postbuild Plugin, maybe something like this will work (haven't tried it)
def job = hudson.getItem(jobname)
hudson.queue.schedule(job)
I am actually surprised that if you fingerprint both jobs (e.g. with the BUILD_TAG variable of the parent job) the aggregated results are not picked up. In my understanding, Jenkins simply looks at md5sums to relate jobs ("Aggregate downstream test results"), and triggering via the CLI should not affect aggregating results. Somehow, there is something additional going on to maintain the upstream/downstream relation that I am not aware of...
This worked for me using "Execute system groovy script":
import hudson.model.*
def currentBuild = Thread.currentThread().executable
def job = hudson.model.Hudson.instance.getJob("jobname")
def params = new StringParameterValue('paramname', "somestring")
def paramsAction = new ParametersAction(params)
def cause = new hudson.model.Cause.UpstreamCause(currentBuild)
def causeAction = new hudson.model.CauseAction(cause)
hudson.model.Hudson.instance.queue.schedule(job, 0, causeAction, paramsAction)