User Details
- User Since: Dec 16 2014, 10:22 PM (525 w, 5 d)
- Roles: Disabled
- IRC Nick: cmjohnson1
- LDAP User: Cmjohnson
- MediaWiki User: Unknown
Mar 30 2023
There doesn't seem to be a raid controller
Mar 29 2023
@wiki_willy All 3 of these servers are well out of warranty (2-3 years). analytics1068 is marked failed in netbox.
@wiki_willy ms-fe1013 and thanos-fe1004 both installed, but the puppet certificates were not set correctly, and now they both just fail when I try to install with --new or --no pxe.
This is showing 6 disks failed. Is it possible there is a different problem that is causing the disks to fail? I do not see any errors for the raid controller
Mar 27 2023
The DIMM has been replaced. I updated the idrac and bios while it was offline.
Mar 23 2023
@Marostegui @jynus I apologize for the delay with this DIMM. Dell had a question that needed a response, and that is delaying the shipment. It should go out today.
@Marostegui The disk has been replaced, but I did not add it back to the raid configuration. Please do so at your convenience.
Mar 21 2023
Acknowledged, will investigate and update task.
A DIMM has been ordered through Dell.
A new SSD has been requested from Dell.
Mar 20 2023
I am not able to do the initial installs. fe1013 and 1014 fail immediately. Maybe there is a dhcp error and thanos-fe doesn't get a lease.
Mar 16 2023
Mar 13 2023
@MatthewVernon The raid configuration states "Partitioning/Raid: Same as existing ms-fe hosts". Can you be more specific? Is this h/w raid? There is a controller. How do you want me to set the disks up? Thanks!
Mar 9 2023
@MatthewVernon Working on these now. I will let you know if I run into any blocks.
This server is out of warranty, and I am not sure if we have any spares or a battery we can swap from a decom host. I'll update the task with more info after talking with @Jclark-ctr and Willy.
The issue turned out to be no issue, resolving the task.
Failed install, but I didn't change the raid controller.
@RhinosF1 Do I still need to troubleshoot the BBU, or is that no longer needed?
I received an idrac error on 3 of these hosts, and I confirmed with Jeff that he is not able to access the host. I am going to try to update the idrac firmware.
Mar 8 2023
The disk has been swapped and is back online. I am resolving this task and creating a new one for the BBU.
Mar 7 2023
Mar 6 2023
Mar 3 2023
Updated network switches.
We can replace the BBU. Let's get the disk replaced first and then create a new ticket for the BBU.
Mar 2 2023
This is most likely a bad cable. I will fix it today.
The server password was not set correctly. It is fixed now and you should be good to go.
Submitted a ticket with Dell for a new HDD.
Jan 20 2023
Jan 18 2023
Dec 14 2022
@Jclark-ctr Can you try reseating the nic, if that is possible?
@Jclark-ctr I am also getting a media test failure on logstash1036. The DAC cable may be plugged into the wrong port.
@Jclark-ctr I am getting a media test failure for logstash1037. Can you check the cable, please?
Dec 13 2022
@BTullis Yes, if you want to recreate the raid manually, then please do.
Dec 9 2022
KS1002 was installed without an issue. I started over with KS1001, but the mgmt IP address changed and the provision script didn't work. I asked @Jclark-ctr to change it manually when he gets an opportunity.
@Papaul When I try to image these servers, the process fails immediately. This is the error I receive. Any ideas on what is wrong?
Dec 7 2022
Completed.
Dec 6 2022
@Jclark-ctr Can you get with @ayounsi regarding this? It could be an optic that needs to be replaced.
@BTullis These servers are ready for you to image. BIOS, network, and firmware have been updated. I updated site.pp as well.
Dec 1 2022
Nov 8 2022
@Jclark-ctr I ran the netbox provisioning script, but I am not able to ping the mgmt IP for either server. Can you verify that the mgmt cables are connected?
Oct 27 2022
I added these to netbox but when I ran the dns script and home, nothing changed.
@Jclark-ctr Can you look at kafka-logging1005 and make sure the network cable is connected and in the right port? Sorry to bug you on this, but the install script fails immediately after typing go and the mgmt password.
@Ottomata Yes, that is what's happening here.
@Ottomata This is failing in the installer because of the raid configuration. I probably do not have it set correctly. Can you give me the specific hardware raid config? Ex: 2 ssds raid 1 and larger disks raid 10. Thanks!
The mgmt links are still not working. The DNS is correct but I am unable to ping the servers.
Oct 21 2022
The dns has been updated but I am not getting any mgmt connection. I need to check that the mgmt cables are connected.
Oct 19 2022
@herron For the raid setup, are all the disks raid 50? I do not think that the OS will install with that setup. There are 8 750GB SSDs.
These are updated with kafka-stretch1001 and 1002.
@BTullis These have been fixed. I updated the nic firmware and re-ran the image script.
Sep 29 2022
Updated their status.
Sep 28 2022
@Jclark-ctr Can you verify that the mgmt cables are connected to these servers, please?
Also, @Jclark-ctr please check that the network cables are in the correct ports. 1024 is giving me a cable failure.
@Jclark-ctr Can you verify the port for kubernetes1023? It looks like something is already in c6/port 36.
Sep 27 2022
@Joe All yours. I figured it to be the same partman recipe as memcache.
@Joe Which partman recipe do you need for these?