Get Dataverse running on OpenShift (Docker and Kubernetes) #4040
I mentioned this issue to @bjonnh yesterday in chat because I've tagged him as the primary contact for the dev effort by the community to work on Docker support in #3938. He said that free OpenShift accounts would be helpful for him, @donsizemore, and anyone else who wants to help with the effort to get Dataverse running on OpenShift. |
You can get a free account with the starter tier: https://www.openshift.com/pricing/index.html That gets you 1GB of free memory. You can also use https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md or minishift: https://www.openshift.org/minishift/ to run OpenShift on your laptop. |
Rules for writing good images: https://docs.openshift.org/latest/creating_images/guidelines.html
How to set the memory based on the cgroup: There is an example in here: https://blog.openshift.com/managing-compute-resources-openshiftkubernetes/ Look under "Writing Applications".
And here is an example from mysql:
Our postgresql image: https://hub.docker.com/r/openshift/postgresql-92-centos7/
Example templates: https://github.com/openshift/origin/tree/master/examples
hello-openshift is a fine place to start and move on to sample-app
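As a rough illustration of the "set the memory based on the cgroup" idea above, here is a minimal Python sketch. It is not taken from the linked blog post or from any Dataverse image: the cgroup v1 file path, the 50% heap fraction, the 512 MB fallback, and all function names are assumptions chosen for the example.

```python
from pathlib import Path
from typing import Optional

# Assumption: cgroup v1 layout. cgroup v2 exposes the limit elsewhere.
CGROUP_V1_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")

# Assumption: anything above this is treated as "no limit set".
UNLIMITED_THRESHOLD = 1 << 50


def read_cgroup_limit_bytes(path: Path = CGROUP_V1_LIMIT) -> Optional[int]:
    """Return the container memory limit in bytes, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def heap_mb_from_limit(limit_bytes: int, fraction: float = 0.5) -> int:
    """Size the JVM heap as a fraction of the container limit, in MiB."""
    return int(limit_bytes * fraction) // (1024 * 1024)


def jvm_heap_flag(limit_bytes: Optional[int], default_mb: int = 512) -> str:
    """Build an -Xmx flag, falling back to a default when no limit is known."""
    if limit_bytes is None or limit_bytes >= UNLIMITED_THRESHOLD:
        return f"-Xmx{default_mb}m"
    return f"-Xmx{heap_mb_from_limit(limit_bytes)}m"


if __name__ == "__main__":
    print(jvm_heap_flag(read_cgroup_limit_bytes()))
```

An entrypoint script could call something like this before starting the application server, so that the heap tracks whatever memory the plan grants the container.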
@danmcp thanks! @portante @danmcp @landreev @scolapasta and I had a great meeting today. Here's a picture of the whiteboard: @danmcp already did his to-do list items, and my to-do list item is to take the latest images from https://hub.docker.com/r/ndslabs/dataverse/ and reference them in a new file at Basically, we'll be trying to see what breaks when we try to deploy the NDS Labs images "as is" to OpenShift. From the whiteboard, we'll need to dig into these questions about the NDS Labs images in order to make sure they run on OpenShift:
We are deferring the following concerns until the future:
Basically, the definition of done for this issue is that someone interested in kicking the tires on Dataverse for non-production use will be able to spin it up on the free 1 GB OpenShift "starter" plan. The whiteboard drawing offers some clues on what the pull request might look like. In our @danmcp I swung by @djbrooke's office and we'd like to figure out when a good time to put this into a sprint would be. The first available sprint would start next Wednesday, Sep 13, and go for two weeks. Let's not pick a time when you're on vacation! 😄
@pdurbin Many of the examples are in json. The templates can be either. I should be around most of the time over the next few weeks. |
@danmcp awesome. Today I signed up for an OpenShift account and went through https://docs.openshift.com/online/getting_started/index.html . That doc is slightly out of date and I got some weird errors along the way (I grabbed a screenshot if you want it) but eventually they resolved themselves and I could see at http://nodejs-mongo-persistent-pdurbin-example.1d35.starter-us-east-1.openshiftapps.com the simple change I made at pdurbin/nodejs-ex@c6efab9 . Great. I looked at https://github.com/openshift/origin/blob/v3.7.0-alpha.1/examples/hello-openshift/hello-project.json and noticed that there were no containers in there, so I added a containers array under "spec" and included some images from NDS Labs. I'm currently blocked on the error "cannot create projects at the cluster scope" and left a note about this in d287772, which is the first commit of a new
@pdurbin In your case, you're not going to want to create a project but rather import into an existing project. You should have a kind of template like this one: |
@danmcp thanks, in 77b3f67 I switched from "Project" to "Template" and stubbed out in the Dataverse dev guide how to use Minishift, which I just installed and have been playing with (with some guidance from @pameyer ). I was able to expose a route but I'm not sure how to expose the Docker image at https://hub.docker.com/r/ndslabs/dataverse/ within my installation of Minishift. Any advice? |
@pdurbin You will just reference ndslabs/dataverse from an imagestream like this: Then from your container you would reference the imagestream like this: with the name of the image stream you picked. |
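The snippets referred to in the comment above did not survive in this copy of the thread. As a stand-in, here is a hedged Python sketch that generates a template with an image stream pointing at ndslabs/dataverse and a deployment config that references that stream through an image change trigger. It is not @danmcp's original example; the object names (`dataverse`, `dataverse-sketch`) and the function name are assumptions made for illustration.

```python
import json


def build_template(image: str = "ndslabs/dataverse:latest",
                   stream_name: str = "dataverse") -> dict:
    """Build an OpenShift Template with an ImageStream and a DeploymentConfig."""
    # The image stream tracks the external Docker Hub image.
    image_stream = {
        "kind": "ImageStream",
        "apiVersion": "v1",
        "metadata": {"name": stream_name},
        "spec": {
            "tags": [
                {"name": "latest",
                 "from": {"kind": "DockerImage", "name": image}}
            ]
        },
    }
    # The container is tied to the stream by the ImageChange trigger,
    # which names the container and the image stream tag to follow.
    deployment = {
        "kind": "DeploymentConfig",
        "apiVersion": "v1",
        "metadata": {"name": stream_name},
        "spec": {
            "replicas": 1,
            "selector": {"name": stream_name},
            "template": {
                "metadata": {"labels": {"name": stream_name}},
                "spec": {
                    "containers": [
                        {"name": stream_name, "image": stream_name}
                    ]
                },
            },
            "triggers": [
                {"type": "ConfigChange"},
                {
                    "type": "ImageChange",
                    "imageChangeParams": {
                        "automatic": True,
                        "containerNames": [stream_name],
                        "from": {"kind": "ImageStreamTag",
                                 "name": f"{stream_name}:latest"},
                    },
                },
            ],
        },
    }
    return {
        "kind": "Template",
        "apiVersion": "v1",
        "metadata": {"name": f"{stream_name}-sketch"},
        "objects": [image_stream, deployment],
    }


if __name__ == "__main__":
    print(json.dumps(build_template(), indent=2))
```

Writing the printed JSON to a file yields something that can be imported into an existing project rather than creating a new one, which matches the advice earlier in the thread.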
@danmcp thanks! I tried at pdurbin@e1e492f (pushed to my personal repo this time because of the error below) but I got a crazy error:
It obviously shouldn't give that error but it doesn't like your json. Try this one:
@danmcp thanks! Added in 4702e0a. Under "Applications" there are now entries under "Deployments" and "Pods" which seems like great progress, but I'm getting this in the log:
Here's a screenshot: |
Scratch that. I tried again and now I'm getting this:
This error seems to be coming from https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/dataverse/entrypoint.sh#L69 https://github.com/nds-org/ndslabs-dataverse/blob/9ddc9efa54185ffd69e25487159a09c4bb2e56bf/dockerfiles/README.md#starting-dataverse-under-docker has some nice information about how you have to start PostgreSQL and Solr before starting Dataverse, which makes sense. |
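One way to express the "start PostgreSQL and Solr before Dataverse" ordering is to wait until each dependency accepts TCP connections. The sketch below is an illustration only and is not what the NDS Labs entrypoint.sh does; the function name and default timings are assumptions, and the host names and ports in the usage section (`postgres`, `solr`, 5432, 8983) are placeholders based on the usual defaults for those services.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until host:port accepts a TCP connection or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service is at least listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical service names and ports; adjust to the real deployment.
    for name, port in (("postgres", 5432), ("solr", 8983)):
        if not wait_for_port(name, port):
            raise SystemExit(f"{name}:{port} did not become reachable")
```

A listening port does not guarantee the service is fully initialized, so a real startup script may also need an application-level check.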
@pdurbin Similar to this example: You're going to want to add postgres and solr to the same template. And have env vars generated to connect them all together. |
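The example referenced above is missing from this copy of the thread. The following Python sketch shows the general shape of the suggestion under stated assumptions: services for PostgreSQL and Solr in the same template, a generated template parameter for a shared password, and environment variables that wire the Dataverse container to both. The variable names (`POSTGRES_HOST`, `SOLR_HOST`, `DB_PASSWORD`), the service names, and the function names are hypothetical and are not what the Dataverse images actually read.

```python
import json

# All names below are hypothetical placeholders for illustration.


def service(name: str, port: int) -> dict:
    """A Service gives other pods a stable DNS name for a component."""
    return {
        "kind": "Service",
        "apiVersion": "v1",
        "metadata": {"name": name},
        "spec": {"ports": [{"port": port}], "selector": {"name": name}},
    }


def build_parameters() -> list:
    """Template parameters; generated values are filled in at processing time."""
    return [{
        "name": "DB_PASSWORD",
        "description": "Password shared by PostgreSQL and Dataverse",
        "generate": "expression",
        "from": "[a-zA-Z0-9]{16}",
    }]


def dataverse_env() -> list:
    """Env vars that point the Dataverse container at its dependencies."""
    return [
        {"name": "POSTGRES_HOST", "value": "postgres"},
        {"name": "SOLR_HOST", "value": "solr"},
        # ${DB_PASSWORD} is substituted from the template parameter.
        {"name": "DB_PASSWORD", "value": "${DB_PASSWORD}"},
    ]


def build_linked_template() -> dict:
    """Assemble the parameters, services, and a Dataverse container spec."""
    dataverse = {
        "kind": "DeploymentConfig",
        "apiVersion": "v1",
        "metadata": {"name": "dataverse"},
        "spec": {
            "replicas": 1,
            "selector": {"name": "dataverse"},
            "template": {
                "metadata": {"labels": {"name": "dataverse"}},
                "spec": {"containers": [{
                    "name": "dataverse",
                    "image": "ndslabs/dataverse:latest",
                    "env": dataverse_env(),
                }]},
            },
        },
    }
    return {
        "kind": "Template",
        "apiVersion": "v1",
        "metadata": {"name": "dataverse-with-deps-sketch"},
        "parameters": build_parameters(),
        "objects": [service("postgres", 5432), service("solr", 8983), dataverse],
    }


if __name__ == "__main__":
    print(json.dumps(build_linked_template(), indent=2))
```

The PostgreSQL and Solr deployment configs are left out to keep the sketch short; in a full template they would sit alongside the services and receive the same generated password.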
@danmcp thanks, I made some progress, I think, by adding the
doc improvements for using OpenShift and Minishift #4040
I had a nice meeting on Friday with @danmcp @DirectXMan12 @patrickdillon @MichaelClifford Ashwin and Ryan (sorry, I don't know your GitHub username). I merged pull request #4501 from @danmcp to fix up our OpenShift config. Based on feedback in that meeting I made some improvements to the OpenShift and Docker sections of the dev guide in pull request #4500. Heads up that as part of #4419 I'm moving that content to a dedicated page (see 6bee8d1 for example). |
@patrickdillon discovered that on the "develop" branch (I just tested 5ed5edf), when you create a dataverse it is not indexed into Solr (thanks!). The UI doesn't show the facets (screenshot attached) and in server.log we see errors like this:
To be honest, I don't remember if indexing ever worked in the OpenShift environment. The main way I've been testing is by logging in. Here's a screenshot of how the dataverse I just created isn't indexed: 2018-03-21 Update: The screenshot above is actually a bad example because it's expected that a dataverse you just created doesn't have any children. However, I re-tested this yesterday and it really is broken. If you navigate to the root, nothing shows as being indexed. I'm hoping to fix this as part of the upgrade to Solr 7 in #4158. |
Also add more error checking to build.sh Also track default.config used in minishift/openshift.
Ok, I just tweaked our OpenShift config in 493badf and got Solr working in that branch/pull request, which hasn't been merged yet. The tag on Docker Hub is called "4158-update-solr" if anyone wants to try it out. This is the branch where we're upgrading from Solr 4 to Solr 7, so when it gets merged, we'll need to push new images to the "latest" tag on Docker Hub. I should note that because I was struggling so mightily with getting Solr 7 working in OpenShift, I sort of gave up and started running it in /tmp over at 94786ff . At some point I'll work with @danmcp or @DirectXMan12 or some other OpenShift guru to either make this right or, preferably, move to a standard OpenShift-compatible image that we don't have to build ourselves, like we do with postgres (we use the postgres image from centos). This one from @dudash might be a candidate but I haven't tried it yet: https://github.com/dudash/openshift-docker-solr Anyway, the other fix was to restart Glassfish to pick up the change to the
Pull request #4520 was merged yesterday and I just ran
This means that Solr has been upgraded to Solr 7 in that image and the Dataverse war file has been updated to a version (commit 037cb9c) that's compatible with it. Again, there's technical debt in that Solr image (it's running out of |
Just a quick note to say that I just pushed images to Docker Hub as of 639715d which includes the following changes:
I did a simple test of creating a dataverse and making sure that it's indexed. It seems fine. Yesterday I went to the final demo of the stateful sets work (two of the pull requests above) that was contributed by the BU students whom @danmcp, @DirectXMan12, and I have been mentoring all semester. I highly recommend watching their final video at https://github.com/BU-NU-CLOUD-SP18/Dataverse-Scaling#our-project-video which explains what they were up to. We're still a long way from having a production-ready environment on OpenShift for running Dataverse, but these stateful sets will help us scale Glassfish and Postgres independently in the future. As a bonus, since the project included some load testing, check out the JMeter scripts that have been added to #4201. A huge THANK YOU to these students for all of their hard work: Patrick Dillon, Michael Clifford, Ashwin Pillai, and Ryan Morano. See also the thread at https://groups.google.com/d/msg/dataverse-community/TSxf4MTYYjg/7VJB_-GJBAAJ On a related note, another group of BU students in the same class worked on a project related to Dataverse and OpenShift. See the "Spark and Dataverse (Big Data Containers, computation)" thread for more: https://groups.google.com/d/msg/dataverse-community/P4llZSssZ2Q/zvhGltLpAQAJ . Thank you to them as well!
Just watched the final video and it was super cool to see this in action! You guys did a great job and it is awesome to see Dataverse being able to take the steps towards being fully scalable. |
This is good news and the work is in the same direction as UiT's goals for Dataverse 👍 |
Yesterday I met with @portante and @danmcp and talked a fair amount about the possibility of getting Dataverse running on OpenShift. There are multiple reasons why I'm interested in this:
Getting Dataverse running on OpenShift isn't on our roadmap, so I've created this issue so we can estimate it in sprint planning or backlog grooming. Anyone reading this is very welcome to leave comments or ask questions!