Command line length limit in GenerateProtoTask causes overwriting of proto descriptors #774

@SlavikN14

Description

When compiling a large number of .proto files, the generated protoc command line exceeds the plugin's default command length limit, DEFAULT_CMD_LENGTH_LIMIT (64K characters):

// Most OSs impose some kind of command length limit.
// Rather than account for all cases, pick a reasonable default of 64K.
static final int DEFAULT_CMD_LENGTH_LIMIT = 65536

To stay under this limit, the generateCmds function splits the .proto file list across multiple smaller commands. However, every one of these commands is built from the same base command, so each writes to the same output descriptor file via --descriptor_set_out, and the last command overwrites the output of the previous ones.

Affected Code

GenerateProtoTask.groovy (Lines 187-211)

static List<List<String>> generateCmds(List<String> baseCmd, List<File> protoFiles, int cmdLengthLimit) {
  List<List<String>> cmds = []
  if (!protoFiles.isEmpty()) {
    int baseCmdLength = baseCmd.sum { it.length() + CMD_ARGUMENT_EXTRA_LENGTH } as int
    List<String> currentArgs = []
    int currentArgsLength = 0
    for (File proto: protoFiles) {
      String protoFileName = proto
      int currentFileLength = protoFileName.length() + CMD_ARGUMENT_EXTRA_LENGTH
      if (baseCmdLength + currentArgsLength + currentFileLength > cmdLengthLimit) {
        cmds.add(baseCmd + currentArgs) // Adds a command before overflow
        currentArgs.clear()
        currentArgsLength = 0
      }
      currentArgs.add(protoFileName)
      currentArgsLength += currentFileLength
    }
    cmds.add(baseCmd + currentArgs)
  }
  return cmds
}

Expected Behavior

Each split command should write to its own descriptor file, and the resulting files should then be merged into the configured output so that no descriptors are lost.

Actual Behavior

Each generated command uses the same --descriptor_set_out parameter, leading to overwriting instead of appending.

Steps to Reproduce

  1. With descriptor set output enabled, compile a large number of .proto files whose combined paths exceed the command length limit.
  2. Observe that multiple commands are executed.
  3. Check the final descriptor file – it contains only the last batch of .proto files.
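
The splitting behavior can be illustrated without running protoc. The following is a rough Java port of generateCmds, meant only as a sketch: the value of CMD_ARGUMENT_EXTRA_LENGTH is an assumption (the issue does not quote it), file paths are plain strings, and the class and helper names are hypothetical. With a deliberately small limit, it shows that every resulting command carries the identical --descriptor_set_out argument.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitDemo {
  // Assumed per-argument overhead; the real constant lives in
  // GenerateProtoTask and its value is not quoted in this issue.
  static final int CMD_ARGUMENT_EXTRA_LENGTH = 3;

  // Mirrors the batching logic of generateCmds shown above.
  static List<List<String>> generateCmds(List<String> baseCmd, List<String> protoFiles, int limit) {
    List<List<String>> cmds = new ArrayList<>();
    if (protoFiles.isEmpty()) {
      return cmds;
    }
    int baseLength = 0;
    for (String arg : baseCmd) {
      baseLength += arg.length() + CMD_ARGUMENT_EXTRA_LENGTH;
    }
    List<String> currentArgs = new ArrayList<>();
    int currentLength = 0;
    for (String file : protoFiles) {
      int fileLength = file.length() + CMD_ARGUMENT_EXTRA_LENGTH;
      if (baseLength + currentLength + fileLength > limit) {
        cmds.add(concat(baseCmd, currentArgs)); // flush the batch before it overflows
        currentArgs.clear();
        currentLength = 0;
      }
      currentArgs.add(file);
      currentLength += fileLength;
    }
    cmds.add(concat(baseCmd, currentArgs));
    return cmds;
  }

  static List<String> concat(List<String> a, List<String> b) {
    List<String> out = new ArrayList<>(a);
    out.addAll(b);
    return out;
  }

  // Counts the commands that contain exactly the given argument.
  static long countWithArg(List<List<String>> cmds, String arg) {
    return cmds.stream().filter(c -> c.contains(arg)).count();
  }

  public static void main(String[] args) {
    List<String> base = List.of("protoc", "--descriptor_set_out=out.desc");
    List<String> files = List.of("a.proto", "b.proto", "c.proto", "d.proto", "e.proto");
    List<List<String>> cmds = generateCmds(base, files, 65); // tiny limit to force splitting
    System.out.println(cmds.size() + " commands, "
        + countWithArg(cmds, "--descriptor_set_out=out.desc") + " writing to out.desc");
  }
}
```

Because each batch is just the base command plus a slice of the file list, nothing in the split distinguishes one batch's descriptor output from another's.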

Proposed Fix

  • Modify the --descriptor_set_out path for each split command.
  • Merge the descriptor files after all commands are executed.

Would appreciate feedback or any alternative suggestions! 🚀
