Oracle Java downloading issues

Many team members are familiar with the intermittent problem with downloading the Oracle JDK:

fatal: [149.202.161.220]: FAILED! => {
    "changed": false,
    "dest": "/var/tmp/jdk-8u131-linux-x64.tar.gz",
    "elapsed": 1,
    "msg": "Request failed",
    "response": "HTTP Error 403: Forbidden",
    "status_code": 403,
    "url": "http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz"
}

We reported this issue multiple times, but it looks like there are not enough people interested in fixing it.

Meanwhile, it is already the most common cause of deployment failures and is becoming really annoying (just search “oracle” in MM); sometimes it can block work for days.

There are three known options to resolve this:

  1. Self-host the Oracle JDK file and substitute the URL in the configuration. This is the easiest solution, but there might be licensing issues with redistribution. Please comment if you know for sure that we won’t have problems if we self-host it.
  2. Set up a caching proxy and download the Oracle JDK through it.
  3. Switch to OpenJDK. This can be upstreamed, and at least two team members (@sid and @adolfo) have tried to do this, and may even have succeeded, but I’m not aware of the results.
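
To make option 1 a bit more concrete, here is a minimal Python sketch of the idea: try a self-hosted copy first and fall back to the Oracle URL. The mirror URL and the function names are hypothetical and only for illustration; our actual deployment uses Ansible, so a real fix would live in the playbook configuration rather than in a script like this.

```python
import urllib.error
import urllib.request

# URL taken from the failing task output above.
ORACLE_URL = (
    "http://download.oracle.com/otn-pub/java/jdk/8u131-b11/"
    "d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz"
)
# Hypothetical self-hosted location (assumption, not a real host).
MIRROR_URL = "https://jdk-mirror.example.internal/jdk-8u131-linux-x64.tar.gz"


def fetch_url(url, timeout=30.0):
    """Download a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_fallback(urls, fetch=fetch_url):
    """Try each URL in order and return (url, body) for the first success."""
    errors = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all download sources failed: " + "; ".join(errors))
```

Calling `download_with_fallback([MIRROR_URL, ORACLE_URL])` would then only hit Oracle when the self-hosted copy is unavailable.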

Any thoughts on what option is the best? Once we reach consensus, I can create and take a ticket to fix this.

[ Ticket to log time ].

IANAL, but I’d say this is the most straightforward solution since we are only using it for deploying our instances and not redistributing it (which we don’t have the rights for). We just need to make sure that we don’t make this URL accessible to everyone on the internet.

I’d say if @adolfo or @sid has had success replacing Oracle JDK with OpenJDK, then we should go that way, because it (hopefully) does not require much effort (the hard part is already done) and we could help others as well in the form of a contribution to edX.

Otherwise, we could simply do what @guruprasad mentioned.

I would vote for the caching proxy option since it has the potential to speed up all downloads, including those from PyPI. It’s worth testing this.
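
For testing, routing a download through such a proxy could look roughly like the Python sketch below. The proxy address is an assumption (a local proxy listening on port 3128), and `build_proxy_opener` is just a hypothetical helper name.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(proxy_handler)


# Assumed proxy address, for illustration only.
PROXY_URL = "http://127.0.0.1:3128"
opener = build_proxy_opener(PROXY_URL)
# opener.open(url) would then fetch the URL via the proxy.
```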

@adolfo, what would you say? Is it possible to upstream replacing Oracle JDK with OpenJDK? I’m asking because there must be a reason edX went with a proprietary solution in the first place.

Otherwise, I agree with @kshitij and would go with the caching proxy option and schedule a discovery on how to set up something like Squid and integrate it into our infrastructure.

Here’s what I think. OpenJDK works just fine, and we should certainly push in that direction, but:

  • I don’t know how hard it’ll be to upstream to edx/configuration. As far as I understand things, edX still uses some of the playbooks to deploy their stuff - but not necessarily any roles that depend on Java.
  • Tutor just uses upstream Elasticsearch images, AFAICT, so that would be a non-issue there.
  • A caching proxy is a good idea whatever else we do, so +1 to that.

So long as the solution is configurable, and defaults to the current Oracle JDK option, then upstreaming shouldn’t be a blocker here.

There’s an issue under the Periodic Builds epic to address this: SE-3963, and I’m working to get it assigned for next sprint.

Thanks, @jill.

Just in case, I decided to go with the easiest solution and included BB-4097 in the current sprint. My reasoning is that this issue causes us too much trouble, so we need to have at least a temporary fix.

But yes, the ideal solution is to switch to OpenJDK and add a caching proxy. For the latter, I’m going to create a ticket or a small epic.

@demid I’m not sure we can legally proxy that download… cf Slack comment.

@jill, some of our clients use pre-built AMIs that likely contain this JDK archive, so we already distribute it.

@jill, IANAL, but all the license terms apply only to redistribution. A caching proxy can’t be considered redistribution? We can also self-host the file(s), and as long as it is not accessible to everyone on the internet and is used only for OpenCraft internal purposes, even that wouldn’t qualify as redistribution?

IANAL either, but since Oracle are throwing up more and more barriers to automatic downloading, like requiring logins and acceptance of licenses, I definitely think a publicly-accessible caching proxy would not be well received.

A private one may be ok, but I don’t know.