In some cases inline documents are not sufficient for storing extended information in a document. This is especially the case if this information might be relevant from outside as well. Projects, components and releases contain attachments. The metadata of these attachments is stored as inline documents inside the parent document (which is the project, component or release). However, these attachments may be used by other documents as well, e.g. license info files which are attached to releases are used by projects to generate the overall license information for that project. In such cases an external document might be the better model. For example, the attachment usage can be stored along with the metadata without touching the owner document on update.
In any case it is highly dependent on the use case whether external documents are to be favored over internal documents.
Ease of Use | Performance | Integration Effort |
---|---|---|
⭐⭐ Medium, special views have to be created and fields of data objects have to be annotated. | ⭐⭐⭐ Very good, multiple documents are fetched with a single request. | ⭐ High, since existing code has to be changed. |
At the time of writing, support for external (or linked) documents in CouchDB is limited. Consider the following documents:
project = {
_id: "p1",
type: "project",
name: "Testproject",
attachments: [
{ _id: "a1" },
{ _id: "a2" }
]
}
attachment1 = {
_id: "a1",
type: "attachment",
name: "SourceFile",
sha1: "abc1234"
}
attachment2 = {
_id: "a2",
type: "attachment",
name: "LicenseFile",
sha1: "fed9876"
}
Unfortunately there is no way to get the project document with the attachments directly included. With the correct view you are able to retrieve all these documents in a single request:
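function(doc) {
if(doc.type === "project") {
// the project itself: with include_docs the emitting document is returned
emit(doc._id, null);
// one row per attachment: the _id in the value tells CouchDB which document to include
for(var i in doc.attachments) {
emit(doc._id, { _id: doc.attachments[i]._id });
}
}
}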
You might see the trick: the project document as well as the attachment documents are indexed with the id of the project. This way you get all three documents when querying the view with the id of the project:
{
"total_rows":5,
"offset":0,
"rows":[{
"id":"p1",
"key": "p1",
"doc":{
"_id":"p1",
"attachments":[
{ "_id": "a1" }, { "_id": "a2" }
],
"name":"Testproject",
...
},
...
}, {
"id":"p1",
"key": "p1",
"value":{ "_id": "a1" },
"doc":{
"_id":"a1",
"name": "SourceFile",
...
},
...
}, {
"id":"p1",
"key": "p1",
"value":{ "_id": "a2" },
"doc":{
"_id":"a2",
"name": "LicenseFile",
...
},
...
}
]
}
Note that this will only work if you query the view with include_docs set to true.
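Once the rows arrive, the client still has to stitch them back together. The following is a minimal sketch in plain Java, assuming the response has already been parsed into maps (for example by a JSON library). The class and method names are hypothetical and are not part of SW360 or Ektorp. It treats the included document whose id equals the queried key as the owner and replaces each id stub in its attachments field by the matching full document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LinkedRowsResolver {

    /**
     * Rebuilds the owner document from view rows fetched with include_docs=true.
     * Each row is expected to carry the included document under "doc".
     */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> resolve(String ownerId, List<Map<String, Object>> rows) {
        Map<String, Object> owner = null;
        Map<String, Map<String, Object>> linkedById = new LinkedHashMap<>();

        for (Map<String, Object> row : rows) {
            Map<String, Object> doc = (Map<String, Object>) row.get("doc");
            if (doc == null) {
                continue; // row without an included document
            }
            String id = (String) doc.get("_id");
            if (ownerId.equals(id)) {
                owner = new LinkedHashMap<>(doc); // copy, so the input stays untouched
            } else {
                linkedById.put(id, doc);
            }
        }
        if (owner == null) {
            return null; // the view returned no row for the owner itself
        }

        // Replace every { "_id": ... } stub by the full document (or null if it is missing).
        List<Map<String, Object>> resolved = new ArrayList<>();
        Object stubs = owner.get("attachments");
        if (stubs instanceof List) {
            for (Object stub : (List<?>) stubs) {
                Object id = ((Map<?, ?>) stub).get("_id");
                resolved.add(linkedById.get(id));
            }
        }
        owner.put("attachments", resolved);
        return owner;
    }
}
```

Calling `LinkedRowsResolver.resolve("p1", rows)` with the three rows from the response above would yield the project with both attachment documents in place of the stubs. A stub whose document is not part of the rows ends up as null in the list.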
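Note that include_docs only works at the top level of a value. In other words, it only recognizes the following two situations: the emitted value is null (or has no _id field), in which case the emitting document itself is included, or the emitted value is an object with an _id field at its top level, in which case the document with that id is included.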
https://github.com/eclipse/sw360/pull/596 shows an implementation that transparently reads such results from CouchDB. It consists of:
Have a look at the mapping function in the theory section above. Of course you may add more than one type of linked document, e.g. not only attachments but releases as well. You may also emit whole objects instead of ids only. This way CouchDB does not have to look up each entry. However, whether to emit ids or whole objects is a topic of its own.
You should write methods in your repository as well as in your database handler that use the new methods from the database connector.
Be sure that the object mapper used in your database handler is aware of the mixin. Of course you can annotate more than one field. All annotated fields will be respected on loading. However, if the view does not contain an object that should be resolved, it will be replaced by null. The LinkedDocuments annotation even allows you to name a different destination field for the resolved objects for easier integration into the existing code.
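To make the idea of annotated fields with an optional destination field more concrete, here is a small reflection-based sketch. It is an illustration under assumptions only: the annotation `Linked`, its `destination` attribute and the `Project` class are hypothetical stand-ins and do not reproduce the signature of the real LinkedDocuments annotation or the mixin-based wiring used in SW360.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;

public class LinkedFieldSketch {

    /** Hypothetical stand-in annotation, NOT the real SW360 LinkedDocuments signature. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Linked {
        /** Name of the field receiving the resolved objects; empty means the annotated field. */
        String destination() default "";
    }

    /** Hypothetical data object: ids live in one field, resolved documents in another. */
    public static class Project {
        @Linked(destination = "attachments")
        public List<Object> attachmentIds = new ArrayList<>();
        public List<Object> attachments = new ArrayList<>();
    }

    /** Resolves every annotated field of owner against the documents returned by the view. */
    public static void resolve(Object owner, Map<String, ?> docsById) {
        try {
            for (Field source : owner.getClass().getDeclaredFields()) {
                Linked link = source.getAnnotation(Linked.class);
                if (link == null) {
                    continue; // not a linked field
                }
                source.setAccessible(true);
                List<Object> resolved = new ArrayList<>();
                for (Object id : (Collection<?>) source.get(owner)) {
                    resolved.add(docsById.get(id)); // null if the view did not contain it
                }
                String targetName = link.destination().isEmpty() ? source.getName() : link.destination();
                Field target = owner.getClass().getDeclaredField(targetName);
                target.setAccessible(true);
                target.set(owner, resolved);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot resolve linked fields", e);
        }
    }
}
```

Because the loop visits all declared fields, annotating a second field (say, for releases) would need no further code in this sketch. Existing code can keep reading the destination field as before.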
Easy to use? | Performance? | Effort to use in existing code |
---|---|---|
:no_entry: does not work | :no_entry: | :no_entry: |
Since SW360 uses Ektorp as its object mapper, a response like the one above is not suitable: Ektorp is simply not able to parse it correctly.
However, Ektorp has a linking feature as well: you may annotate fields with the @DocumentReference annotation to tell Ektorp to store the content in external documents. At the time of writing, this only works with fields of type Set. Since SW360 data objects are generated using Thrift, directly annotating the field is not possible, but thanks to the mixin feature of Ektorp this is not a big issue. Unfortunately, getting the @DocumentReference annotation to work was not possible with reasonable effort.
Internally, Ektorp also uses special views to make linked documents work. A quick look into the source code suggests that this feature is implemented using special serializers, which would lead to additional requests on loading and storing as well. Therefore the same performance issues would probably arise even if the annotation worked.
Easy to use? | Performance? | Effort to use in existing code |
---|---|---|
⭐⭐⭐ Quite easy, only some Jackson configuration is necessary | ⭐⭐ Good, but each type of linked object needs an additional request | ⭐⭐⭐ Low, existing code does not have to be changed |
This method works just like the Ektorp way. In addition, a gradual transition from internal to external documents is possible, since the custom serialization methods handle both cases directly. Any embedded documents will be externalized on the first update of the owner object. The following classes are needed:
This will configure Ektorp to use a special class for this field. We use a special serializer for the field instead of for the type (in this case Attachment), so we can do serialization/deserialization for all attachments at once. If we used a special serializer for the type instead, every attachment would be handled on its own.
public abstract class SplitAttachmentsMixin extends DatabaseMixIn {
@JsonSerialize(using = AttachmentSetSerializer.class)
@JsonDeserialize(using = AttachmentSetDeserializer.class)
public abstract void setAttachments(Set<Attachment> attachments);
}
public class SplitAttachmentsMapperFactory extends MapperFactory {
private final AttachmentHandlerInstantiator handlerInitiator;
public SplitAttachmentsMapperFactory(Supplier<HttpClient> httpClient, String dbName) throws MalformedURLException {
handlerInitiator = new AttachmentHandlerInstantiator(httpClient, dbName);
}
@Override
public ObjectMapper createObjectMapper() {
ObjectMapper objectMapper = super.createObjectMapper();
objectMapper.addMixInAnnotations(Project.class, SplitAttachmentsMixin.class);
objectMapper.setHandlerInstantiator(handlerInitiator);
return objectMapper;
}
private static class AttachmentHandlerInstantiator extends HandlerInstantiator {
private final AttachmentSetSerializer attachmentSetSerializer;
private final AttachmentSetDeserializer attachmentSetDeserializer;
public AttachmentHandlerInstantiator(Supplier<HttpClient> httpClient, String dbName) throws MalformedURLException {
attachmentSetSerializer = new AttachmentSetSerializer(httpClient, dbName);
attachmentSetDeserializer = new AttachmentSetDeserializer(httpClient, dbName);
}
@Override
public JsonDeserializer<?> deserializerInstance(DeserializationConfig config, Annotated annotated, Class<?> deserClass) {
if (deserClass.isInstance(attachmentSetDeserializer)) {
return attachmentSetDeserializer;
}
return null;
}
...
}
}
public class AttachmentSetSerializer extends JsonSerializer<Set<Attachment>> {
private final AttachmentDatabaseHandler handler;
public AttachmentSetSerializer(Supplier<HttpClient> httpClient, String dbName) throws MalformedURLException {
this.handler = new AttachmentDatabaseHandler(httpClient, dbName);
}
@Override
public void serialize(Set<Attachment> attachments, JsonGenerator jsonGenerator, SerializerProvider provider)
throws IOException, JsonProcessingException {
try {
List<DocumentOperationResult> results = handler.bulkCreateOrUpdateAttachments(attachments);
if (!results.isEmpty()) {
throw new IOException("Cannot create or update attachments. Some failed: " + results);
}
} catch (SW360Exception exception) {
throw new IOException("Cannot create or update attachments.", exception);
}
jsonGenerator.writeStartArray();
for (Attachment attachment : attachments) {
jsonGenerator.writeStartObject();
jsonGenerator.writeStringField("_id", attachment.getId());
jsonGenerator.writeEndObject();
}
jsonGenerator.writeEndArray();
}
}
public class AttachmentSetDeserializer extends JsonDeserializer<Set<Attachment>> {
private final AttachmentDatabaseHandler handler;
public AttachmentSetDeserializer(Supplier<HttpClient> httpClient, String dbName) throws MalformedURLException {
this.handler = new AttachmentDatabaseHandler(httpClient, dbName);
}
@Override
public Set<Attachment> deserialize(JsonParser jsonParser, DeserializationContext context) throws IOException, JsonProcessingException {
Set<Attachment> attachments = Sets.newHashSet();
if (!jsonParser.isExpectedStartArrayToken()) {
throw new IllegalStateException("Expected array token but found: " + jsonParser.getCurrentToken().asString());
}
Set<String> attachmentIds = Sets.newHashSet();
JsonToken token = jsonParser.nextToken();
while (!JsonToken.END_ARRAY.equals(token)) {
switch (token) {
case START_OBJECT:
Attachment attachment = jsonParser.readValueAs(Attachment.class);
if (attachment.isSetId() && !attachment.isSetRevision()) {
attachmentIds.add(attachment.getId());
} else {
attachments.add(attachment);
}
break;
default:
throw new IllegalStateException(
"Unexpected token. Expected object but found: " + jsonParser.getCurrentToken().asString());
}
token = jsonParser.nextToken();
}
if (!attachmentIds.isEmpty()) {
try {
attachments.addAll(handler.retrieveAttachments(attachmentIds));
} catch (SW360Exception exception) {
throw new IOException("Cannot load attachments (" + attachmentIds + ")", exception);
}
}
return attachments;
}
}
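The gradual transition rests on one decision per array entry: an entry that carries an id but no revision is treated as a reference to an external document, anything else as a document that is still embedded. The following sketch isolates that decision in plain Java without Jackson or CouchDB. The class `MixedAttachmentArray` and the use of generic maps are assumptions made for illustration, as is the mapping of the revision to a `_rev` key.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MixedAttachmentArray {

    /** Result of one pass over the array: ids to fetch and documents that were still embedded. */
    public static final class Split {
        public final Set<String> idsToLoad = new LinkedHashSet<>();
        public final List<Map<String, Object>> embedded = new ArrayList<>();
    }

    /** Separates id-only references from documents that are still stored inline. */
    public static Split split(List<Map<String, Object>> entries) {
        Split result = new Split();
        for (Map<String, Object> entry : entries) {
            boolean hasId = entry.get("_id") != null;
            boolean hasRevision = entry.get("_rev") != null; // assumed key for the revision
            if (hasId && !hasRevision) {
                // external document: only remember the id, so all ids can be loaded in bulk
                result.idsToLoad.add((String) entry.get("_id"));
            } else {
                // legacy inline document: keep it as it is
                result.embedded.add(entry);
            }
        }
        return result;
    }
}
```

Collecting the ids first and loading them together mirrors why the deserializer above works on the whole set: one bulk request is enough regardless of how many attachments are referenced. On the next save, the serializer writes id stubs for all of them, which is what externalizes the formerly embedded entries.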