[LMR-4] Cannot connect to Ceph S3 bucket using lhsmd s3 plugin Created: 17/Mar/17  Updated: 08/Feb/24  Resolved: 08/Feb/24

Status: Resolved
Project: Lemur
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Mangala Jyothi Bhaskar Assignee: Michael MacDonald (Inactive)
Resolution: Incomplete Votes: 0
Labels: None
Environment:

Lemur: lemur-0.5.2
Ceph S3: jewel (ceph version 10.2.5)
IEEL 3.0 servers: Lustre 2.7.15.3
Lustre client: 2.7.16.10
Kernel: 3.10.0-327.36.2


Attachments: EndpointVerification.txt, Log.txt, agent.txt, radoslog.txt
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I have a Ceph cluster running Jewel (ceph version 10.2.5), and I understand that this version supports Auth V4 for S3.

I created a public bucket and am trying to connect to it from a node using the lhsm S3 plugin. This node has Lemur 0.5.2 installed, and I verified that it can talk to my Ceph gateway; I was able to successfully resolve the endpoint from this host. However, when using the same endpoint and the public bucket, the S3 plugin fails to connect with errors. I am not sure whether this is an error in my settings or something else. Could you please take a close look at the logs and the plugin config file to help me out?

Attachments:

Log.txt: lhsm S3 error log

agent.txt: agent config file

lhsm-plugin-s3: S3 plugin config file

 Comments   
Comment by Mangala Jyothi Bhaskar [ 17/Mar/17 ]

Adding another attachment: EndpointVerification.txt

This attachment shows that I was able to verify my endpoint and connectivity to this bucket on the same host without errors. However, connecting to the bucket using the lhsm S3 plugin fails.

Comment by Michael MacDonald (Inactive) [ 17/Mar/17 ]

Hi, thanks for reporting this. To the best of my knowledge, you are the first to try using Lemur with Ceph. Based on the configs and logs you've provided, I don't see any obvious problems. We'll need to set this up ourselves to try to reproduce it and hopefully fix whatever incompatibility is causing the issue.

Comment by Mangala Jyothi Bhaskar [ 17/Mar/17 ]

Thank you for the response. Looking forward to it. Let me know if I can help make the process quicker or easier in any way.

Comment by Michael MacDonald (Inactive) [ 17/Mar/17 ]

Jyothi, is there anything interesting in the Ceph server logs? HTTP 400 means "bad request", but that could mean lots of things.

Comment by Mangala Jyothi Bhaskar [ 19/Mar/17 ]

I have attached the Ceph RADOS gateway server log. I couldn't really reconcile the timestamps with Lemur: the failure was around 3pm Friday the 17th, but the Lemur logs show a different timestamp, and I don't believe the clock on this node is wrong either. Anyway, I picked the logs from the Ceph gateway around that time frame and couldn't get much of a hint.

Attachment : radoslog.txt

Comment by Michael MacDonald (Inactive) [ 20/Mar/17 ]

OK, so I have some updates.

First of all, I have found an issue with the default rgw frontend (civetweb) and the AWS SDK we're using in Lemur: https://github.com/ceph/ceph/pull/7675

The good news is that I was able to get past this problem by switching to an apache frontend, as documented here: http://docs.ceph.com/docs/jewel/man/8/radosgw/ (it's worth pointing out that you can use the unix domain socket config, as support for that has been backported to centos/rhel 7).
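For reference, a minimal ceph.conf sketch of that frontend switch (the section name and socket path below are placeholders; see the radosgw man page linked above for the full apache setup):

[client.radosgw.gateway]
# replace the default civetweb frontend with apache/fastcgi
rgw frontends = fastcgi
# unix domain socket (support backported to centos/rhel 7)
rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock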

Unfortunately, there seems to be some other incompatibility that I am now investigating. I was able to archive a file, but restore is failing. I'll update when I've figured that out.

Comment by Michael MacDonald (Inactive) [ 20/Mar/17 ]

I have pushed a fix for the restore issue. I'm still working on other compatibility testing, and I've discovered a problem with multipart uploads: it seems that files above a certain size threshold trigger the SDK's multipart upload, and rgw does not implement this feature.

Comment by Mangala Jyothi Bhaskar [ 21/Mar/17 ]

Thank you for the updates. Do you suggest I switch from civetweb to the apache frontend as well?

Comment by Michael MacDonald (Inactive) [ 21/Mar/17 ]

The version of civetweb in Ceph Kraken may fix the problem, but I haven't tested it. If you want to stick with an LTS release of Ceph, then I think switching to apache for your radosgw frontend is probably the best bet. I suspect it will be more performant and more flexible to configure as well.

FYI, I pushed another commit last night which enables a workaround for the multipart upload problem in rgw. You can set upload_part_size = 5000 (5GB) to force a single PUT. The downside is that you won't be able to archive files > 5GB in rgw with this workaround. I hope to find a better solution today or tomorrow.
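For example, in the lhsm-plugin-s3 config file (a sketch: only upload_part_size comes from this workaround; the other key and its value are illustrative placeholders):

# lhsm-plugin-s3 excerpt -- endpoint value is a placeholder
endpoint = "http://rgw.example.com"
upload_part_size = 5000 # 5000MB (5GB) parts force a single PUT; files > 5GB can't be archived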

You can download development RPMs built for el7 here: http://lemur-release.s3-website-us-east-1.amazonaws.com/devel/0.5.2_6_g8186b94/

Comment by Mangala Jyothi Bhaskar [ 21/Mar/17 ]

Thank you for the quick turnaround. You are correct, I'd rather switch to apache than upgrade Ceph at this point in time. In the meantime I will download and test the new RPMs and let you know.

Comment by Michael MacDonald (Inactive) [ 21/Mar/17 ]

Hi Jyothi.

I have some news... After digging around quite a bit, I decided to try updating the version of the AWS SDK we're using in Lemur. It turns out the problems with civetweb and multipart uploads are both fixed by this update, so you can use either frontend; I have confirmed that it works fine with both.

If you'd like to try a devel build, please use these RPMs: http://lemur-release.s3-website-us-east-1.amazonaws.com/devel/0.5.2_9_g7d31e58/

We'll be tagging a release soon (later this week).

Comment by Michael MacDonald (Inactive) [ 22/Mar/17 ]

FYI, we have tagged 0.6.0, which includes the fixes mentioned previously. You can grab RPMs from here: http://lemur-release.s3-website-us-east-1.amazonaws.com/release/0.6.0/

If you have a chance to test with this new build, we'd appreciate a confirmation that it fixes the problem you've reported so we can close the ticket.

Comment by Mangala Jyothi Bhaskar [ 22/Mar/17 ]

I was held up working on another project; I should be testing this and updating you soon. Very good to hear that all the issues that were seen are fixed. When you say "problems with civetweb and multipart upload", do you mean I can continue using civetweb and still have this working?

Comment by Michael MacDonald (Inactive) [ 22/Mar/17 ]

Correct. In my testing with the updated build, Lemur was able to successfully connect and perform upload/download operations against rgw using both the apache and civetweb frontends. At this point, I don't think the choice of frontend really matters as far as Lemur is concerned.

Comment by Mangala Jyothi Bhaskar [ 22/Mar/17 ]

It is great that I can connect to S3. I do not see errors anymore:

2017/03/22 21:40:38 lhsm-plugin-s3-stderr DEBUG 16:40:38.468090 dmclient.go:324: Registered archive 2, cookie 1
DEBUG 16:40:38.468108 mover.go:65: s3-2 started

However, there is probably something I am not doing the right way with archival. I have a 10G file on Lustre and I run "lfs hsm_archive ten.bin" or "lhsm archive ten.bin". When I do so, I'd expect to see activity in the lhsmd daemon's logs saying it received an archive request, since I have it running in debug mode, but I do not see this. So I went to the MDT to check whether there are active requests; I see that my agent is registered, but it shows no active requests. Do you know what I might be missing?

lctl get_param -n mdt.*.hsm.agents
uuid=62c11daa-586d-25d2-421b-d8cbeca39e53 archive_id=ANY requests=[current:0 ok:0 errors:0]

Comment by Michael MacDonald (Inactive) [ 22/Mar/17 ]

What do lctl get_param -n mdt.*.hsm.actions and lctl get_param -n mdt.*.hsm.active_requests show?

From what you've shown in your comment, I don't see any indication that you're doing anything wrong. You don't have any other copytools registered with the coordinator, do you?

I don't have any experience with the version of IEEL you're using, so I'm not sure if there are any known issues with HSM in that release. bfaccini, do you have any input on this?

These are the same servers running Lustre 2.7.15.3 on which you reported problems archiving 30k small files in LMR-3, correct?

Comment by Bruno Faccini (Inactive) [ 22/Mar/17 ]

Michael,
The "current:0" counter seems to indicate that the CDT has not attempted to forward any request to the registered copytool.
And I am not aware of any problem in this communication path with EE 3.0. So the actions/active_requests dump that you have already requested will be interesting to see.
I also think that the MDS messages/syslog could help to identify unexpected problems.
"lctl get_param -n mdt.*.hsm_control" should also be a good start point.

Comment by Robert Read (Inactive) [ 23/Mar/17 ]

Please also check the current state of the file with:

lfs hsm_state ten.bin
lfs hsm_action ten.bin

Comment by Mangala Jyothi Bhaskar [ 23/Mar/17 ]

To answer some of the questions:

1) Yes, it is the same IEEL 3.0 system that I was using for the POSIX plugin (servers running Lustre 2.7.15.3).

2) Correct, there are no other copytools registered with the HSM coordinator other than this one; I verified that.

3) lctl get_param -n mdt.*.hsm_control shows "enabled". (I am able to successfully archive using the POSIX plugin, and I do see the logs in the session with the lhsmd daemon, but I do not see them with S3. Should I not expect to see them with this plugin?)

4) lfs hsm_state ten.bin shows: ten.bin: (0x00000009) exists archived, archive_id:1. (This is the same file that was archived to NFS using the POSIX plugin, since archive_id = 1; for the S3 plugin I have set archive id = 2. So if the file were archived, I should see another archive with archive_id = 2, correct? Should I be able to archive the same file to two different tiers?)

5) lfs hsm_action ten.bin shows: ten.bin: NOOP. (Looks like it did not attempt to archive.)

6) lctl get_param -n mdt.*.hsm.actions as well as lctl get_param -n mdt.*.hsm.active_requests return nothing.

7) From the MDS syslog or dmesg, the only log I see from today (when I re-ran the test) is "[Thu Mar 23 10:37:19 2017] Lustre: HSM agent 62c11daa-586d-25d2-421b-d8cbeca39e53 already registered".

Please let me know if there is something else I could check or verify.

Comment by Michael MacDonald (Inactive) [ 23/Mar/17 ]

Hi. The current design of Lustre HSM does not support multiple archive tiers for a given file. The reason you are not seeing the request come through to the s3 mover is that the file has already been marked as archived in your posix archive.

You can specify which archive to use for an archive operation with the --id N flag, but if the file is already archived, any subsequent archive operations will be NOOPs. You can remove the file from the posix archive (lhsm remove ten.bin) and then re-archive it using --id 2 to get it into the s3 archive.
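A plausible sequence with the file from this ticket (the exact flag placement is my assumption; check lhsm --help):

lhsm remove ten.bin # remove the existing copy from the posix archive (id 1)
lhsm archive --id 2 ten.bin # re-archive the file into the s3 archive
lfs hsm_state ten.bin # should now report archive_id:2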

Comment by Mangala Jyothi Bhaskar [ 23/Mar/17 ]

Ah, I see. Thank you for letting me know. I will either remove this file from the posix archive or create a new one and try it out. Right now I am re-running the 30k-file test with Lemur 0.6.0 to see if I can still reproduce that issue, so in a couple of hours I should be able to re-test S3 and send out an update.

 

When you mentioned multipart upload, do you mean parallelism within a single file? That is, when multiple threads are used for a single file, is the file divided into chunks that are shared among the threads? I thought this was NOT a feature of the POSIX plugin either; correct me if I am wrong. Once I have a successful archive and restore of a single file using S3, I will move on to the multiple-files scenario, as I am currently doing with the POSIX plugin.

 

Also, in the meantime I'd like to get my concepts clear on "handlers" and "threads" and how they work with a single file vs. multiple files. Is there detailed documentation I can refer to? Thanks!

Comment by Mangala Jyothi Bhaskar [ 23/Mar/17 ]

While I still have the questions from my previous comment regarding multipart upload, parallelism within a single file, and "handlers"/"threads", I have an update on the S3 archival.

I created a new file and re-ran the archive. This time I did see an incoming request in the Lemur debug logs (which is great); however, the logs seem to say that the archive failed:

 

DEBUG 13:50:23.605717 agent.go:152: handler-26: incoming: AI: 58ccc5b6 ARCHIVE [0x2000088d1:0x7537:0x0] 0,EOF []
ALERT 2017/03/23 18:50:23 /tmp/rpmbuild/BUILD/lemur-0.6.0/src/github.com/intel-hpdd/lemur/cmd/lhsmd/agent/agent.go:176: no handler for archive 1
2017/03/23 18:50:23 id:1 fail 58ccc5b6 [0x2000088d1:0x7537:0x0]: -1
2017/03/23 18:50:33 archive:1 total:2 queue:-1 0/0/0 min:89.87µs max:770.006µs mean:429.938µs median:429.938µs 75%:770.006µs 95%:770.006µs 99%:770.006µs 99.9%:770.006µs

 

Also on the MDS end I see "lctl get_param -n mdt.*.hsm.agents":
uuid=bdba0062-7a60-607e-d97f-0921a9ea48e7 archive_id=ANY requests=[current:0 ok:0 errors:1]

 

There are no new logs in dmesg on the MDS server, though.

Comment by Michael MacDonald (Inactive) [ 23/Mar/17 ]

Multipart upload to S3 is a feature provided by the AWS SDK. There is some per-file upload/download concurrency happening, but it is confined to a single data mover instance; it's mostly an implementation detail rather than something we'd advertise as a feature. Lemur does not currently support per-file parallelism across multiple data movers, but we may add that to the roadmap at some point. The POSIX data mover does not do any per-file parallelism at this time.
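As a rough illustration of that per-file concurrency, here is a minimal sketch of how the AWS SDK's s3manager uploader works (this is not Lemur's actual mover code; the endpoint, bucket, key, and part size are placeholder values):

package main

import (
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	// Placeholder endpoint/region; path-style addressing is typical for rgw.
	// Credentials come from the environment or shared credentials file.
	sess := session.Must(session.NewSession(&aws.Config{
		Region:           aws.String("us-east-1"),
		Endpoint:         aws.String("http://rgw.example.com"),
		S3ForcePathStyle: aws.Bool(true),
	}))

	f, err := os.Open("ten.bin")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// The uploader splits a large file into parts and uploads several parts
	// concurrently -- the per-file concurrency mentioned above -- all within
	// a single process (i.e. a single data mover instance).
	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = 64 * 1024 * 1024 // 64MiB parts
		u.Concurrency = 4             // up to 4 parts in flight at once
	})

	if _, err := uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("lustre-archive"), // placeholder bucket
		Key:    aws.String("archive/ten.bin"),
		Body:   f,
	}); err != nil {
		log.Fatal(err)
	}
}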

Handlers and threads are functionally the same thing in Lemur; we should probably clean up the documentation a bit on that. A given data mover can be configured with N handlers/threads, and that determines the number of concurrent HSM actions the mover will handle.
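For instance, a hypothetical mover config excerpt (the num_threads key name is my assumption; check the sample config shipped with the plugin):

# lhsm-plugin-s3 excerpt -- key name assumed, see the shipped sample config
num_threads = 8 # this mover will process up to 8 HSM actions concurrently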

Comment by Michael MacDonald (Inactive) [ 23/Mar/17 ]

Did you specify --id=2 when you archived? By default, lhsm uses id=ANY, so it looks like maybe it tried to use archive 1.

Comment by Mangala Jyothi Bhaskar [ 23/Mar/17 ]

Thank you for clarifying the parallelism part. I get it. 

Regarding handlers and threads, I probably still have some confusion. I thought handlers were like leader threads, i.e. threads that just assign archive jobs to worker threads and do not take part in the archival itself. Could you please give me an example to help me understand how it is designed to work?

 

My bad, you were correct about specifying archive id = 2 in the archive command. I was thinking that this would be handled entirely on the Lemur side, but I understand now that HSM has to pass the correct info to Lemur for it to work. The 10G file has now been archived successfully:

2017/03/23 19:06:44 lhsm-plugin-s3-stderr DEBUG 14:06:44.164968 mover.go:123: s3-2 id:1 Archived 10737418240 bytes in 2m21.330191021s from .lustre/fid/[0x2000088d1:0x7539:0x0] to
DEBUG 14:06:44.165491 agent_action.go:191: id:1 update offset: 0 length: 10737418240 complete: true status: 0
DEBUG 14:06:44.165504 agent_action.go:194: id:1 completed status: 0 in 2m21.331095053s
2017/03/23 19:06:48 archive:2 total:1 queue:0 0/0/0 min:2m21.343477855s max:2m21.343477855s mean:2m21.343477855s median:2m21.343477855s 75%:2m21.343477855s 95%:2m21.343477855s 99%:2m21.343477855s 99.9%:2m21.343477855s

 

To verify, I even checked the contents of the S3 bucket, and I see the file there:

python s3contents.py
hello.txt 12 2017-03-01T17:56:20.280Z
archive/o/e3be20a8-0da0-4dcd-838d-02a2d59d9e57 10737418240 2017-03-23T19:06:43.926Z

 

 Restore works as expected as well. 

 

Thank you for working with me through this. I will try out multiple files next. I am not sure whether there could be a limitation there on the HSM end or the S3 end; Lemur probably wouldn't have an upper limit on the number of files being moved.
