Network Reliability Engineer
Summary
Title: Network Reliability Engineer
ID: 2832
Location: Remote
Description

Remote: the successful candidate must live in a state in which Point Broadband operates (AL, FL, GA, MD, MI, NY, OH, TN, TX, or VA).

Summary:

Point Broadband is seeking a Network Reliability Engineer to help ensure the stability, scalability, and operational excellence of our production network. This role focuses on the reliability and operational ownership of core routing and switching platforms across our multi-market environment, including market core/edge and backbone connectivity where applicable.

The engineer in this position serves as a senior escalation point for complex routing and switching incidents affecting production services. Operating within a Network Reliability Engineering (NRE) model, this role emphasizes strong troubleshooting, operational discipline, and continuous improvement across a network built through both organic growth and acquisitions.

In addition to resolving critical issues, this engineer helps drive operational maturity through Operational Readiness Reviews (ORRs), root cause analysis (RCA), and ongoing improvements to monitoring, documentation, and operational runbooks.

Note: This is a production reliability role focused on operational ownership and incident response (not purely an architecture or deployment role).

Duties and Responsibilities:

Essential duties and responsibilities include but are not limited to:

  • Act as a senior escalation point for complex routing and switching issues impacting production services; lead triage and restoration during critical incidents.
  • Operate and support core routing/switching environments across market cores/edges and backbone connectivity; validate stability, redundancy, and performance.
  • Participate in on-call rotation and provide after-hours escalation support for critical network events.
  • Lead or contribute to RCAs for major incidents and recurring issues; drive corrective actions that prevent repeat outages.
  • Improve observability and operational standards through better monitoring, logging/telemetry use, and alert quality.
  • Partner with the NOC to improve operational outcomes through knowledge transfer, runbooks, and troubleshooting standards that reduce escalations.
  • Participate in ORRs for new systems, capacity upgrades, and major changes to ensure production readiness (monitoring, documentation, rollback plans, and support requirements).
  • Define support requirements for new network technologies and participate in roadmap planning.
  • Document designs, operational standards, and change procedures in a clear, reusable format.
  • Maintain punctual, consistent attendance.
  • Other duties as assigned.

Qualifications:

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

Education/Experience:

  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field; or equivalent military/industry experience.
  • Experience operating and troubleshooting production ISP/service provider networks or large-scale enterprise networks of comparable complexity.
  • Proven experience in a Tier 3/Tier 4 escalation role supporting complex routing and switching incidents in production.
  • Strong communication skills and the ability to collaborate effectively during incident response across engineering, operations, and vendor teams.

Technical Skillset:

  • Routing: Hands-on experience implementing and troubleshooting BGP and at least one IGP (OSPF and/or IS-IS). Strong understanding of routing policy, convergence behavior, and failure-domain isolation.
  • Switching: Strong Layer 2/Layer 3 fundamentals including VLAN architecture, trunking, LAG/LACP, MTU, QinQ, and practical troubleshooting of spanning-tree and adjacency issues.
  • Demonstrated ability to diagnose customer-impacting network problems using monitoring/telemetry/logs, device-level tools, and structured troubleshooting methodology; comfortable owning restoration plans during outages.
  • Working familiarity with technologies commonly found in ISP environments, such as DHCP, CGN, IPv6, multicast, and QoS (depth can vary by area, but candidates must be comfortable supporting production issues).
  • Experience configuring and troubleshooting Juniper and/or Arista routing/switching platforms in production.
  • Experience supporting legacy or multi-vendor environments including Nokia 7750, Ciena 5164, Ericsson 6675, and/or MikroTik CCR platforms preferred.
  • Familiarity with automation/scripting (Python, Ansible, etc.) used to improve reliability or operational efficiency preferred.
  • Certifications such as CCNP/CCIE, JNCIP/JNCIE, or equivalent practical experience preferred.

Example Engineering Challenges:

Engineers in this role regularly work on problems such as:

  • Diagnosing routing convergence issues impacting multiple sites/markets.
  • Investigating intermittent packet loss, latency, or microbursts across core paths.
  • Troubleshooting asymmetric routing, policy issues, or traffic engineering anomalies.
  • Stabilizing legacy designs inherited through acquisitions while driving standardization.
  • Supporting capacity expansions and routing policy changes at market cores/edges.
  • Improving monitoring, alert quality, and runbooks to reduce NOC escalations.
  • Completing RCAs after major incidents and driving actions that prevent repeats.

Physical Demands/Working Conditions:

The physical demands described here are representative of those that must be met by employees to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Regularly required to talk, hear, and communicate effectively.
  • Required to use close vision, see colors, and be able to focus.
  • Must be able to perform computer functions and operate routine office equipment.
  • Must be able to perform moderate walking throughout Point Broadband’s facilities.
  • Must be able to sit or stand for extended periods.
  • Must be able to lift equipment weighing 50 pounds or more.
  • Must be physically able to work in tight, confined spaces, such as ceilings and the areas behind server racks.
  • Travel to other offices may be required on an occasional basis.
  • Must be available for emergencies outside normal business hours, including weekends, and must participate in an operational on-call rotation.

Benefits:

  • Medical (3 plans to choose from), Dental and Vision
  • Short Term Disability
  • Flexible Spending Accounts
  • Company-Paid Life Insurance as well as Voluntary Policies
  • 401(k) with generous company match
  • Paid Time Off
  • Share the Care Paid Time Off
  • Paid Holidays
  • Uniform/Boot allowance *Applicable by Position*
  • Cell Phone Allowance *Applicable by Position*
  • Career Progression Opportunities
  • Discounted Broadband Services *Where Applicable*

Point Broadband is an equal opportunity employer. For further information, please visit: EEO is the Law Poster | U.S. Equal Employment Opportunity Commission

 
